Abstract
A number of recent studies have found that temporary members of the United Nations Security Council (UNSC) experience increased foreign aid inflows. We use a constrained permutations approach to replicate analyses found in Vreeland and Dreher (2014). Permuting the timing of country membership on the Security Council, we create placebo UNSC membership histories which plausibly could have been observed. We use these placebos to construct a reference distribution for the null hypothesis that there is no relationship between UNSC membership and foreign aid flows and then observe whether or not the observed test statistic for the correlation found in the real-world data is in the tails of this distribution. In other contexts, such empirically based hypothesis tests have revealed a high false-positive rate for traditional, model-based time-series cross-sectional inference. Given the controversial nature of studies about increased aid flows as secondary benefits of UNSC membership, it is valuable to subject such analyses to additional scrutiny. Our reanalysis largely validates existing findings.
Since the foundational article by McKinlay and Little (1977), the literature on foreign aid allocation has been dominated by studies juxtaposing “recipient need” and “donor interest” explanations (see, among others, Alesina and Dollar, 2000; Berthélemy, 2006). These studies include a set of aid-receiving-country variables purporting to proxy for either recipient need (e.g. gross domestic product (GDP) per capita, population, infant mortality rate) or donor self-interest (e.g. donor exports, military assistance, status as a former colony) in time-series cross-sectional regression models; the results are interpreted as indicating the dominance of either altruistic or selfish motivations on the part of donors.
While broad correlations between recipient-country characteristics and aid flows provide some insight into donor decision-making, this literature rarely describes the specific mechanisms by which variation in recipient country indicators translates into over-time variation in aid flows. A recent wave of aid allocation studies has turned to understanding how aid flows respond to specific geopolitical events. A number of these studies have focused on how the rotation of aid-receiving countries into the 10 non-permanent United Nations Security Council (UNSC) seats results in increased aid flows (Dreher et al., 2009a, 2009b; Kuziemko and Werker, 2006; Vreeland and Dreher, 2014).
Kuziemko and Werker (2006) find that US aid flows increase by 59% to countries rotating onto the Security Council. Vreeland and Dreher (2014) find that temporary UNSC members receive nearly twice as much aid as non-members from the five major bilateral donors; are more likely to participate in IMF programs with lower levels of policy conditionality; and are more likely to receive multilateral development bank projects. In sum, there is evidence that international development actors change their behavior toward countries that rotate into UNSC membership.
These studies argue that the institution of non-permanent member rotation provides leverage for causal identification. As Vreeland and Dreher (2014: 27) describe, “the idiosyncratic selection process [of non-permanent UNSC members] and the exogenously enforced term limits allow us to make controlled comparisons of (1) individual countries on and off the UNSC over time and (2) different countries on and off the UNSC during the same period of time.” Although countries are not randomly assigned to serve on the Security Council, the selection process is understood to such an extent and limited in such ways as to make statements about counterfactuals convincing.
Existing studies have relied on multivariate regression methods for time-series cross-sectional data in order to estimate the effect of UNSC membership. Characterizing uncertainty in the findings therefore has depended on assumptions about and adjustments to the standard errors estimated in those regressions. The baseline assumption of errors estimated by ordinary least squares regression is that the observations are independently and identically distributed. Clustering in the data means that the number of informative observations in the data is less than the raw N would suggest; unadjusted standard errors might be overoptimistic (Angrist and Pischke, 2009; Erikson et al., 2014). One solution—employed by existing studies of the UNSC–foreign aid link—is to calculate standard errors to account for clustering. In this paper, we test the robustness of this correction using constrained randomization tests.
Standard regression methods estimate a test statistic based on the ratio of an estimated coefficient to an estimated standard error. When the coefficient is sufficiently large relative to the standard error, the test statistic appears in the tails of a theoretical probability distribution, and the analyst rejects the null hypothesis that the coefficient is equal to zero, deeming the estimate “statistically significant.” Placebo-based randomization tests create a reference distribution from the data itself, rather than relying on assumptions about the shape of disturbances in the data (Erikson et al., 2014). In our constrained randomization test, we create a reference distribution of test statistics based on 1000 plausible but invented Security Council membership histories. For each of these placebos, we estimate the relationship between membership and foreign aid flows using the replication dataset and modeling strategy from Vreeland and Dreher (2014). We then determine whether the test statistic produced by the original data is truly outlying within the distribution of statistics estimated using the placebos. The placebo-based method is less susceptible to false positives than traditional distribution-based tests because the placebo data should contain the real-world clustering and time dependence that might lead to deflated standard errors in traditional tests.
Our constrained randomization tests largely replicate the estimates of uncertainty from Vreeland and Dreher (2014). Using a non-parametric test of statistical significance, we demonstrate the robustness of their claims, providing additional evidence that there exists a meaningful correlation between temporary UNSC membership and increased foreign aid flows.
The logic linking foreign aid to UNSC membership
The idea that foreign aid is a tool of statecraft is well established. As Morgenthau said, “Much of what goes by the name of foreign aid today is in the nature of bribes … a price paid for political services rendered or to be rendered” (Morgenthau, 1962: 302). Bueno de Mesquita and Smith (2007, 2009) similarly describe foreign aid flows as quid pro quo transfers. Given the temporary elevation in geopolitical prominence of states elected to the Security Council, such states might be the target of additional aid. With regard to services being rendered by these temporary UNSC members in exchange for aid, Vreeland and Dreher (2014) propose that the most powerful countries in the world want to gain increased legitimacy for the actions that they take by winning UNSC votes by large margins.
While Vreeland and Dreher (2014) offer several examples of explicit quid pro quo exchanges (e.g. the United States cutting foreign aid to Yemen in 1990 after it voted against UNSC Resolution 678 to authorize the use of force against Iraq), they suggest that increases in foreign aid to non-permanent UNSC members result from a “complex web” of connections among ambassadors and government agencies. Because of a high-level desire to court the favor of temporary UNSC members, foreign aid to these countries is more likely to be discussed within national aid agencies and multilateral institutions. The quid pro quo exchange may be more implicit than explicit.
Beyond the theoretical arguments for why UNSC membership might lead to increased foreign aid flows, this explanatory variable has characteristics that are desirable from the standpoint of causal inference. Rotation on and off the Security Council allows us to make predictions about precise windows during which treatment effects should be observed. While service on the Security Council is not random, the timing of service on the Security Council is demonstrably orthogonal to other determinants of aid flows, such as a state’s level of economic development or the amounts of aid that it has received in previous time periods (Bueno de Mesquita and Smith, 2010; Vreeland and Dreher, 2014). In fact, the most important determinant of service seems to be the amount of time since a country last served on the Council (i.e. there exists a norm of turn-taking, particularly among African countries; Vreeland and Dreher, 2014). Vreeland and Dreher (2014: 135) therefore describe temporary UNSC membership as something that can be considered “a chance event.”
Existing methods
Although Vreeland and Dreher (2014) assert the exogeneity of UNSC membership, their main analyses rely on multivariate linear regression models controlling for background characteristics that plausibly affect both election to the Security Council and foreign aid flows. Specifically, they control for whether a country—in a given year—is a pariah state or at war, and also for the country’s GDP per capita, regime type, and level of incoming military assistance from the United States. To control for other time-invariant country characteristics, they include country fixed effects. They also include year fixed effects and a region-specific quartic time trend. A similar strategy is employed in Kuziemko and Werker (2006) and Dreher et al. (2009a, 2009b). In a study of the effects of UNSC membership on economic growth, democracy, and press freedom, Bueno de Mesquita and Smith (2010) use a nearest-neighbor matching analysis to identify the set of counterfactual non-UNSC-member cases that are most similar to temporary members in a given year in terms of population, GDP per capita, and level of democracy.
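The specification just described can be summarized compactly. The notation below is a sketch of our own rather than the equation printed in Vreeland and Dreher (2014): y_{it} stands for the aid outcome for recipient i in year t, x_{it} collects the listed controls, r(i) denotes the region of country i, and the exact functional form of the outcome and of the trend terms is an assumption.

```latex
y_{it} = \beta \,\mathrm{UNSC}_{it}
       + \mathbf{x}_{it}'\boldsymbol{\gamma}
       + \alpha_i + \delta_t
       + \sum_{k=1}^{4} \theta_{r(i),k}\, t^{k}
       + \varepsilon_{it}
```

Here \alpha_i and \delta_t are the country and year fixed effects, the sum is the region-specific quartic time trend, and \beta is the coefficient whose t-statistic is the object of the tests discussed in the following sections.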
The constrained permutation method
Building on Reynolds (2010, 2014), we use a constrained permutation method to construct a randomization test. We compare the test statistic for the observed correlation between UNSC membership and foreign aid flows to a distribution of test statistics from correlations estimated using placebo Security Councils. This process yields an empirically derived p-value that we compare to those in Vreeland and Dreher (2014).
Thinking about alternative UNSC membership profiles that realistically might have been observed, we know that exactly 10 countries will be on the Security Council each year, and these countries obviously must be members of the United Nations. We know that these countries will come from five different regional groups. And we know that treatment for each country will start and stop at a particular moment in time and that, given the rule against reelection, no country will receive consecutive treatments. Finally, we recognize that some countries appear on the Security Council with greater frequency than others.
Thus, our constrained permutation approach creates placebo Security Councils based on (a) membership rules and (b) the observed frequency of UNSC participation. We assign countries to terms on our counterfactual Security Councils based on the number of times that they have historically served on the Security Council (since 1946), while respecting the regional distribution and the prohibition on back-to-back terms. Brazil, for instance, has served on the Security Council 10 times. Therefore, in each of our placebo UNSC histories, Brazil appears on the Security Council 10 times. For each permutation, however, the specific terms for which Brazil has tenure on the Council have been independently assigned.
We also constrain our placebos to account for two temporal variations in the operation of elections to the Security Council. First, regional groupings changed in 1966. Our permutations, therefore, create hypothetical Security Councils for the period 1946–1965 that draw on the five regional groupings that existed over that time period and hypothetical Security Councils for the period 1966–2014 that draw on the revised groupings. Second, there have been 15 instances in which countries have served only a single-year term. Three of these were on the original 1946 Council, and the remaining 12 are spread over the decade from 1956 to 1966. In our permutations, these single-year terms are constrained to be single-year terms in historical time, and a country that is assigned to one of the relevant terms (e.g. the slot that was actually held by the Philippines in 1963) serves only a single year on that placebo Security Council rather than the standard two years. Countries are assigned only to a tenure held by a country from their same regional grouping and are assigned only in the years during which they were UN member states. As a general rule of thumb, we would want someone knowledgeable about the Security Council to have difficulty identifying which Security Councils actually existed and which are placebos.
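The assignment rules above can be illustrated with a minimal Python sketch. It is not the code used for the analysis: the function names, the toy slots and countries, and the rejection-sampling strategy (shuffle within a region, then discard draws that break a rule) are all assumptions made for illustration, and the sketch ignores the 1966 change in regional groupings.

```python
import random
from collections import Counter, defaultdict


def is_valid(candidate, slots, joined_un):
    """Check UN membership timing and the ban on back-to-back terms."""
    held = defaultdict(list)
    for slot_id, country in candidate.items():
        _region, start, end = slots[slot_id]
        if start < joined_un[country]:
            return False  # country was not yet a UN member
        held[country].append((start, end))
    for terms in held.values():
        terms.sort()
        for (_s0, e0), (s1, _e1) in zip(terms, terms[1:]):
            if s1 <= e0 + 1:
                return False  # overlapping or immediately consecutive terms
    return True


def draw_placebo(slots, terms_served, joined_un, rng, max_tries=10_000):
    """Return {slot index: country} for one placebo membership history.

    slots: list of (region, first_year, last_year); a single-year term
        simply has first_year == last_year.
    terms_served: {country: (region, number of historical terms)}.
    joined_un: {country: year of UN accession}.
    """
    by_region = defaultdict(list)
    for slot_id, (region, _start, _end) in enumerate(slots):
        by_region[region].append(slot_id)

    assignment = {}
    for region, slot_ids in by_region.items():
        # Each country appears once per historical term, so placebo
        # frequencies match the observed frequencies exactly.
        pool = [c for c, (r, n) in terms_served.items()
                if r == region for _ in range(n)]
        if len(pool) != len(slot_ids):
            raise ValueError(f"term counts do not fill region {region!r}")
        for _ in range(max_tries):
            rng.shuffle(pool)
            candidate = dict(zip(slot_ids, pool))
            if is_valid(candidate, slots, joined_un):
                assignment.update(candidate)
                break
        else:
            raise RuntimeError(f"no valid draw found for region {region!r}")
    return assignment


# Hypothetical toy input: two regions, six terms, five countries.
toy_slots = [("A", 2000, 2001), ("A", 2002, 2003), ("A", 2004, 2005),
             ("A", 2006, 2007), ("B", 2001, 2002), ("B", 2003, 2004)]
toy_terms = {"X": ("A", 2), "Y": ("A", 1), "Z": ("A", 1),
             "P": ("B", 1), "Q": ("B", 1)}
toy_joined = {country: 1945 for country in toy_terms}

placebo = draw_placebo(toy_slots, toy_terms, toy_joined, random.Random(0))
print(Counter(placebo.values()))
```

Rejection sampling is chosen here only because it is easy to verify; with many seats and tight constraints a more efficient sampler would probably be needed.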
Other scholars have used permutation in a less constrained way. Erikson et al. (2014) scramble at random the identities associated with countries for the independent variable of interest in order to create a reference distribution in a time-series cross-sectional setting. In the context of UNSC compositions, we are less comfortable with such an unconstrained permutation because such a procedure would result in implausible compositions. For example, an unconstrained permutation might substitute Palau’s Security Council profile for Italy’s. Such a move might contribute to observing an incorrect number of states from the regional groups or to countries being assigned an unrepresentative number of times: Palau has never served on the Security Council while Italy is a relatively frequent member. Our comparisons therefore are constrained in the sense that permutations are plausible given what we know about the history of UNSC elections.
We merge 1000 placebo UNSC histories with the data from Vreeland and Dreher (2014), which covers the period 1960–2009. The results that we present below therefore exclude the parts of the placebo histories that are outside of this period. In Chapter 5 of their book, Vreeland and Dreher (2014) present results from more than 50 multivariate linear regression models of foreign aid. There are four types of models, each of which is estimated across 13 different donors: the largest five bilateral donors and eight multilateral donors. The first set of regressions models aid to 125 countries as a function of an indicator variable for temporary UNSC membership. The authors’ estimate for the UNSC indicator is statistically significant for Japan, Germany, the International Bank for Reconstruction and Development (IBRD), and the UN High Commissioner for Refugees (UNHCR). The second set of regressions is identical except that the authors interact the UNSC indicator with indicators of UNSC activity in a given year, classifying each year as unimportant, somewhat important, or important. In these models, Vreeland and Dreher (2014) find statistically significant correlations between UNSC membership during important years and aid flows from the United States, Germany, and the IBRD. The third set of models is estimated for 49 African recipient countries, and the authors find that UNSC membership significantly predicts increased aid flows to these countries from Japan, the IBRD, the International Development Association (IDA), the United Nations in general, the World Food Programme (WFP), the United Nations Children’s Fund (UNICEF), and the UNHCR. Finally, the fourth set of regressions models aid to African recipient countries as a function of serving on the Security Council during important years or not. Vreeland and Dreher (2014) find significant correlations between UNSC membership by African recipients in important years and aid flows from the United States, the IBRD, the WFP, and the UNHCR.
We replicate the regression models just described using the set of 1000 placebo Security Councils. For each regression model that we estimate, we use the same specification as Vreeland and Dreher (2014) and retain the estimated t-statistic on the variable of interest (UNSC membership in the first and third sets of regressions and UNSC membership during important years in the second and fourth sets of regressions). The t-statistic on the coefficient of interest is a “pivotal statistic” because it is adjusted for correlation among the other variables included in the regression (Erikson et al., 2014).
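The replication loop can be sketched as follows, under strong simplifying assumptions. The sketch keeps only country fixed effects (via within-country demeaning) and a single regressor, and it uses a cluster-robust standard error with no small-sample correction; the year effects, controls, and regional trends of the actual specification are omitted. All names and the toy panel are hypothetical.

```python
import math
from collections import defaultdict


def within_demean(values, units):
    """Subtract each unit's mean, which absorbs unit fixed effects."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, u in zip(values, units):
        sums[u] += v
        counts[u] += 1
    return [v - sums[u] / counts[u] for v, u in zip(values, units)]


def clustered_t(y, x, units):
    """t-statistic on x with unit fixed effects and unit-clustered errors."""
    yd = within_demean(y, units)
    xd = within_demean(x, units)
    sxx = sum(a * a for a in xd)
    beta = sum(a * b for a, b in zip(xd, yd)) / sxx
    # Sum the score x * residual within each cluster before squaring, so
    # correlated errors within a country are not counted as independent.
    score = defaultdict(float)
    for a, b, u in zip(xd, yd, units):
        score[u] += a * (b - beta * a)
    se = math.sqrt(sum(s * s for s in score.values())) / sxx
    return beta / se


# Hypothetical toy panel: three countries observed for four years each.
units = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
membership = [0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
noise = [0.1, -0.1, 0.2, -0.2, -0.2, 0.1, 0.1, 0.0, 0.3, -0.1, -0.1, -0.1]
level = {"a": 0.0, "b": 10.0, "c": 20.0}
aid = [level[u] + 2.0 * m + e for u, m, e in zip(units, membership, noise)]

t_observed = clustered_t(aid, membership, units)

# In the full procedure this list would hold 1000 placebo histories.
placebos = [[1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
            [0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0]]
reference = [clustered_t(aid, p, units) for p in placebos]
print(t_observed, reference)
```

The point of the sketch is only the structure of the loop: the same estimator is applied to the real membership variable and to every placebo, and the resulting t-statistics form the reference distribution.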
We create reference distributions and generate randomization-based two-tailed p-values by doubling the proportion of test statistics in the reference distribution that are more extreme than the observed t-statistic produced by the statistical model of the real-world data. This number tells us whether or not that original t-statistic was outlying based not on the Student’s t-distribution but instead on an empirical distribution of t-statistics produced by analyzing the patterns found in the placebo data.
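As a small illustration of this calculation, the following sketch computes a doubled-tail p-value from a list of placebo t-statistics. The function name is hypothetical, and two details are assumptions rather than choices documented above: ties are counted as "at least as extreme," and the result is capped at 1.

```python
def randomization_p_value(t_observed, t_placebo):
    """Two-tailed p-value: twice the share of placebo statistics lying at
    least as far into the same tail as the observed statistic."""
    n = len(t_placebo)
    upper = sum(t >= t_observed for t in t_placebo) / n
    lower = sum(t <= t_observed for t in t_placebo) / n
    return min(1.0, 2.0 * min(upper, lower))


# Hypothetical reference distribution of ten placebo t-statistics.
toy_reference = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
print(randomization_p_value(3.5, toy_reference))
```

With 1000 placebos, the smallest nonzero p-value this procedure can return is 0.002, which bounds the precision of any reported randomization p-value.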
Results
As can be seen in Tables 1–4, our randomization-based p-values are almost always quite close to the distribution-based p-values reported in Vreeland and Dreher (2014). In some cases, they suggest slightly more precision (e.g. for Japanese and German aid flows in Table 1), and in some cases they suggest slightly less precision (e.g. for US and UNICEF aid flows in Table 1). But generally speaking, the randomization-based methods that we employ validate the levels of statistical significance reported in Vreeland and Dreher (2014).
Table 1. Global results: all UNSC years.
IBRD: International Bank for Reconstruction and Development; IDA: International Development Association; UN: United Nations; UNDP: United Nations Development Programme; UNICEF: United Nations Children’s Fund; UNHCR: United Nations High Commissioner for Refugees; UNTA: United Nations Regular Programme for Technical Assistance; V&D: Vreeland and Dreher; WFP: World Food Programme.
Table 2. Global results: important UNSC years.
Table 3. Africa results: all UNSC years.
Table 4. Africa results: important UNSC years.
The model for which we see a relatively substantial change in the p-value linked to a regression coefficient that Vreeland and Dreher (2014) consider to be statistically significant is found in Table 2. The randomization p-value reported for the coefficient on the relationship between UNSC membership in important years and aid flows from the United States is more than five times the distribution-based p-value reported in Vreeland and Dreher’s Table 5.1: p < 0.33 as compared to p < 0.06. While Vreeland and Dreher (2014) note that their estimate has a wide 90% confidence interval and report that dropping certain control variables from the model increases the size of the t-statistic, our results suggest that we should remain skeptical about the extent to which US aid flows to temporary members of the Security Council increase in important years.
Figures 1 and 2 provide insight into how the permutation-based tests work. In each figure, we plot a theoretical t-distribution and the t-statistic from the real-world data. We then overlay the reference distribution created from the t-statistics produced using data from the 1000 placebo Security Councils, and we report the median t-statistic observed in this reference distribution. Figure 1 draws on the analyses of US aid flows; Figure 2 on analyses of Japanese aid flows.

Figure 1. Distributions of test statistics for models of US aid. The solid line is a kernel smoothed estimate of the distribution of t-statistics estimated from the permutations and depicted in the histogram.

Figure 2. Distributions of test statistics for models of Japanese aid. The solid line is a kernel smoothed estimate of the distribution of t-statistics estimated from the permutations and depicted in the histogram.
In the plot in the lower left-hand corner of Figure 1 (reflecting the models using the African subset and studying all years) and across three of the four plots in Figure 2, we see a very strong coincidence between the Student’s t-distribution and the randomization-based distribution. In the remaining three plots in Figure 1 and in the fourth plot in Figure 2, the coincidence of the theoretical and randomization-based distributions is somewhat less. For the US data, in particular, this non-coincidence of our placebo-based reference distribution and the standard t-distribution suggests that there are aspects of the data that make standard significance tests inappropriate.
As the first rows of Tables 2 and 4 (corresponding to the two plots in the right-hand column of Figure 1) make clear, the level of uncertainty in the findings for the United States is somewhat greater than what conventional tests indicate. The observed t-statistic of 1.90 from the real UNSC data is not located as deeply in the tail of the placebo-based reference distribution as in the tail of the theoretical t-distribution. This more central location implies a randomization p-value of p < 0.33, more than five times as large as the p-value of p < 0.06 calculated using the theoretical t-distribution. In Figure 2, on the other hand, it is easy to see how the observed t-statistics from the regressions of Japanese aid on real-world UNSC membership are in the tails for the plots related to Japanese aid flows to all countries and to African countries, mirroring the small p-values presented in Tables 1 and 3.
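One rough way to put a number on how far a reference distribution departs from the theoretical one is to ask how often the placebo t-statistics would themselves be declared significant by a conventional test. The sketch below is illustrative only and is not a calculation reported in this paper: the function name is hypothetical, and it uses the normal critical value as an approximation to the t-distribution, which assumes a large number of degrees of freedom.

```python
from statistics import NormalDist


def placebo_rejection_rate(t_placebo, alpha=0.05):
    """Share of placebo t-statistics that a conventional two-sided test at
    level alpha would call significant (normal approximation)."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    exceed = sum(abs(t) > critical for t in t_placebo)
    return exceed / len(t_placebo)


# Hypothetical placebo statistics: two of the four exceed about 1.96.
print(placebo_rejection_rate([0.5, -2.5, 1.0, 3.0]))
```

If the conventional test were well calibrated for a given model, this share would be close to alpha; a noticeably larger share would correspond to a reference distribution that is visibly wider than the theoretical curve.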
Conclusions
Previous work has pointed out that p-values produced by standard statistical analysis of complicated time-series cross-sectional data might not accurately reflect the level of uncertainty in coefficient estimates (Erikson et al., 2010, 2014). One method for addressing this and creating more accurate statistical significance estimates is to create a reference distribution for a test statistic based on a set of counterfactual placebos generated by permuting the existing data. Such permutation-based tests are popular in the experimental literature (Gerber and Green, 2012) and have been increasingly applied to observational data.
In this paper, we have replicated a substantial portion of the analysis from Chapter 5 of Vreeland and Dreher’s (2014) book on how foreign aid flows respond to temporary membership on the Security Council. We largely reproduce the statistical uncertainty estimates that those authors obtain using clustered standard errors in a multivariate regression. We have provided additional validation of the inferential claims that the authors make.
In doing so, we have made use of constrained permutations. As also applied in Reynolds (2010, 2014), this type of analysis creates a placebo test using a simulation process that mirrors as closely as possible the data-generating process. In our case, we have constrained our hypothetical Security Councils to follow the regional distribution of temporary UNSC members and to respect the prohibition of consecutive terms for temporary UNSC members. Just as permutation-based tests in the experimental literature make use of information that the analyst has about the way in which the randomization of treatment occurred, our constrained permutation tests will be most plausible where the analyst has extensive knowledge about the data-generating process. When this is the case, we advocate using these tests as a complement to existing methods.
While the results reported here do not deviate significantly from the original results, we might expect to see more divergence in situations where the structure of the data is less amenable to conventional statistical solutions such as clustered standard errors (e.g. in the study of dyadic data). In those cases, where there is unusual clustering or unmodeled time-dependence in the data, a constrained permutation-based test may be more likely to reveal unacceptable false positive rates for conventional statistical tests.
Acknowledgements
We thank James Vreeland and Axel Dreher for sharing their replication data with us. A previous version of this paper was presented at the 2015 Midwest Political Science Association Annual Meeting. We also thank Faisal Ahmed, Jeff Harden, and two anonymous reviewers for useful comments and Brian Gaines, Kelly Rader, and Paul Testa for useful conversations.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
