Abstract
In their 2021 article published in the British Journal of Political Science, Calla Hummel, John Gerring, and Thomas Burt (henceforth Hummel et al.) examine the relationship between political finance reforms and corruption, concluding that state funding of political parties reduces corruption by diminishing reliance on private financing. Our replication study challenges this conclusion on two grounds: data accuracy and operationalization issues affecting their key explanatory variable, the Political Finance Subsidy Index (PFSI), and the sensitivity of their results to alternative model specifications and regional subsamples. While our partial data corrections and re-coding yield results that do not differ significantly from Hummel et al.’s findings, the substantial regional heterogeneity in the relationship between public party funding and corruption, together with the results’ sensitivity to alternative modelling choices (two-way, country, and time fixed effects), casts doubt on the reliability of their results. Our findings underscore the methodological challenges in assessing the effects of public funding reforms and emphasize the need for refined measurements and careful consideration of regional contexts. Although public funding may help combat corruption under specific institutional conditions, its effects are not universally consistent, calling for greater caution in deriving policy implications from cross-national analyses.
Introduction
While the relationship between political financing and corruption has already been examined cross-nationally, in some cases signalling that political finance regulation is rather ineffective in preventing corruption (Casal Bértoa et al., 2014; Fazekas and Cingolani, 2017; Lopez et al., 2017), the article by Hummel et al. (2021) (henceforth Hummel et al.) represents, given its geographical (175 countries) and temporal (115 years, 12,380 country-year observations) coverage as well as its methodological sophistication, the most ambitious attempt to estimate the effect of political financing reforms on political corruption. Hummel et al. underscore that their study covers a longer period than existing alternatives such as the Political Finance Dataset maintained by the International Institute for Democracy and Electoral Assistance (International IDEA, 2022) or the Political Party Database (Poguntke et al., 2020). Likewise, Hummel et al. assert that their measure of political financing is superior to the one from the V-Dem project (Coppedge et al., 2017) since ‘unlike the V-Dem measure, [theirs] is based on a factual coding of laws and reports rather than expert perceptions’ (Hummel et al., 2021: 873). Moreover, Hummel et al. build on the International IDEA approach to track political finance regulations using primary sources and various ‘supranational reports, election observer reports, non-governmental organization (NGO) reports and academic articles’ (Hummel et al., 2021: 873). Finally, when the information from these sources was ambiguous, country experts were consulted.
Despite these impressive efforts, our replication reveals a more complex relationship between political finance subsidies and corruption. While our analysis reveals several data coding issues in the original study, the most striking findings emerge from the results’ sensitivity to different modelling approaches and the substantial regional variation in the relationship between state subsidies and corruption. Our analysis shows that two-way fixed-effects models, country fixed-effects models, and year fixed-effects models produce markedly different results, suggesting that the relationship between public funding and corruption is more intricate than the original study implies, as existing research also indicates (Lipcean and Casal Bértoa, 2024). Specifically, two-way fixed-effects models indicate a negative relationship, country fixed-effects models show no such association, and year fixed-effects models yield inconsistent results. Moreover, this relationship varies substantially across world regions, challenging the original study’s conclusions. This paper proceeds in two steps. We first analyse the limitations of the Political Finance Subsidy Index (PFSI), then replicate the original study using alternative model specifications and data re-coding. Our results challenge Hummel et al.’s findings, as their conclusions do not hold when using regional samples and alternative model specifications.
Operationalization of PFSI
Hummel et al.’s PFSI is a composite measure of five binary indicators: de jure subsidies for (1) statutory and (2) electoral activities; de facto subsidies for (3) statutory and (4) electoral operations; and (5) the share of public versus private funding in party income (Hummel et al., 2021: 874; see Table A1 in the online appendix for details). We identify five issues with this operationalization of direct public funding (DPF): (1) redundancy of two of the PFSI components, (2) deficient operationalization of the ‘campaign subsidies’ component, (3) flawed coding of the ‘majority public money’ component, (4) measuring PFSI as a stock variable, and (5) data accuracy.
Redundancy of PFSI components
Firstly, the PFSI duplicates two components related to electoral and organizational funding: since both campaign and party subsidies are coded as de jure and de facto, these components are largely redundant. We are aware of possible instances of legally stipulated subsidies that, for different reasons, are not implemented. While such instances would justify coding de jure and de facto subsidies as separate components of PFSI, we expect such cases to be relatively infrequent. Unlike other dimensions of political funding regulation, such as bans and limits on political contributions and campaign spending or transparency obligations, which impose a heavy regulatory burden on party finances, public funding is of a different nature: it helps parties cope with financial costs and alleviates their fundraising burden. Consequently, political parties have a vested interest in the disbursement of subsidies since these ensure the continuity and stability of their activities.
A correlation test shows that the de jure and de facto components of both electoral and organizational subsidies overlap considerably. The correlation between de jure and de facto electoral subsidies is 0.84, and that between de jure and de facto organizational subsidies is 0.88, which largely supports our argument about the redundancy of these components. A better solution would have been to rely solely on the de facto component for both organizational and election subsidies, thus ignoring the de jure component when the law was not implemented. After all, why would one expect subsidies to deter corruption if they are foreseen by law but not actually distributed to parties? Ultimately, this is the key theoretical argument regarding the discouraging effect of subsidies on corruption (Hummel et al., 2021: 878).
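As an illustration of the kind of correlation test described above, the following minimal sketch computes the Pearson correlation between two binary indicators, which for 0/1 data equals the phi coefficient. The country-year values are invented toy data and are not taken from the Hummel et al. dataset; the names `de_jure` and `de_facto` are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation; for two 0/1 variables this equals the phi coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Toy country-year indicators (hypothetical): the law provides subsidies in
# six of ten observations, but they are actually disbursed in only five.
de_jure = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
de_facto = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

r = pearson(de_jure, de_facto)
print(round(r, 2))
```

When two indicators correlate this strongly, they carry almost the same information, so summing both in an additive index effectively counts one feature twice.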
Operationalization of ‘campaign subsidies’
The coding of de jure and de facto campaign subsidies in the same way as party statutory funding raises even more concerns. These concerns stem from the fact that, unlike organizational subsidies disbursed annually, electoral subsidies are dispensed solely in election years, either before elections as a lump sum for campaigning or after elections as reimbursement of campaign expenses. Either way, campaign-related subsidies represent a ‘one-shot game’ until the next contest. Nonetheless, Hummel et al. code electoral subsidies on an annual basis. Given the average frequency of elections (i.e. once every 4 years), three out of every four years are non-election years, so approximately 75% of the data for countries providing electoral subsidies are miscoded. This would remain a significant share of miscoded data even if we were to assume that elections are organized more often. Finally, even assuming that past and future public funding for elections is expected to deter corruption, its effect in non-election years should be lower than in election years and should be coded accordingly, by applying a specific depreciation rate to the election-related subsidy indicator in non-election years.
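The arithmetic behind the 75% figure, and the suggested depreciation of the election-subsidy indicator, can be sketched as follows. This is only an illustration: the function names and the depreciation rate of 0.5 are hypothetical choices, not values proposed in the text or used by Hummel et al.

```python
def non_election_share(cycle_years):
    """Share of years within one electoral cycle in which no election is held."""
    return (cycle_years - 1) / cycle_years

def depreciated_indicator(cycle_years, rate):
    """Election-subsidy indicator over one cycle: 1 in the election year,
    then shrinking by `rate` (a hypothetical depreciation rate) each year."""
    return [(1 - rate) ** k for k in range(cycle_years)]

# Share of country-years coded as subsidized although no election took place,
# for electoral cycles of different lengths.
for cycle in (2, 3, 4, 5):
    print(cycle, round(non_election_share(cycle), 2))

# One four-year cycle with an illustrative 50% annual depreciation.
print(depreciated_indicator(4, 0.5))
```

Even with elections every two years, half of the country-year observations would still be coded as subsidized in years without any disbursement.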
Coding of the ‘majority public money’ component
As acknowledged by Hummel et al. in the Variables and Coding section of the Online Appendix, the PFSI’s last component (the majority of funding from public sources) ‘is the variable with the shakiest coding’ given the unavailability of data for most countries on the proportion of private versus public funding in party budgets. To tackle this problem, Hummel et al. propose a conservative solution, namely, to code all cases as minority public funding unless there is ‘a source that stated otherwise’. This results in ‘false negatives, i.e. countries that have majority public funding coded as minority public funding’ (Hummel et al., Online Appendix).
Yet this solution distorts PFSI scores in various ways. Firstly, it represents an additional source of duplication for polities offering subsidies since they were already coded twice, for de jure and de facto subsidies. Secondly, such coding might conceal significant variation in the ‘majority’ (above 50% to 100%) and ‘minority’ (0%–50%) subsidy share in party budgets. We illustrate how this may increase uncertainty by crossing data on minority versus majority status from the Hummel et al. dataset with more precise information on public funding from other studies. These data reflect the share of state subsidies in the structure of party income and are based on party financial reports (Biezen and Kopecký, 2017: 87, 90; Casal Bértoa et al., 2014: 374–375; Casas-Zamora, 2005: 49; Nassmacher, 2009: 143). As Figure 1 shows, there is an overlap between ‘majority’ and ‘minority’ status cases.
Figure 1. Relationship between minority versus majority status in Hummel et al. data and the share of subsidies in party income from alternative studies. Sources: own elaboration based on data from Biezen and Kopecký (2017: 83, 90), Casal Bértoa et al. (2014: 374), Casas-Zamora (2005: 49), Nassmacher (2009: 143), and Hummel et al. (2021).
PFSI as a stock variable
Another concern about the PFSI’s validity relates to its operationalization as a ‘stock variable’ (Hummel et al., 2021: 880). Accordingly, PFSI represents a cumulative measure that combines its scores from previous years with the current year’s value. Moreover, the raw PFSI is log-transformed and subject to a depreciation rate conveying the idea ‘that the impact of political finance subsidies accrues over time but with diminishing marginal returns’ (Hummel et al., 2021: 880). Yet both the diminishing marginal returns and stock approaches are questionable for several reasons.
Firstly, it is unclear how one can apply the marginal returns approach to a proxy measure that does not capture the actual level of subsidies. One can apply the diminishing marginal returns framework when DPF reflects a certain amount of funding, say €3 per vote. Hence, one may assume that the second and third euros have diminishing marginal returns in fighting corruption relative to the first one. Our reasoning is based on the analogy with campaign spending where the diminishing marginal returns framework is well established and used to assess the effect of campaign spending on electoral outcomes, which implies that additional spending has a lower impact on gaining votes (Bekkouche et al., 2022; Jacobson, 1990; Stratmann, 2006). However, Hummel et al. do not use the amount of subsidies but rely on a regulatory proxy that is hardly amenable to this approach.
Secondly, it is difficult to justify why PFSI requires cumulation over time. Usually, state subsidies are disbursed on an annual basis and cannot be accumulated over the years, as political parties are obliged to spend those funds annually. At best, parties could accumulate subsidies over a single electoral cycle. Therefore, constructing a cumulative index considerably distorts the operationalization of DPF.
Thirdly, and more importantly, there is a compelling reason why employing a stock measure is dangerous. Consider a scenario where two countries decide to introduce subsidies for party statutory activities to combat political corruption. One provides €0.10 per vote and keeps it at the same level over 20 years, while the other introduces public funding 10 years later but sets it at €2 per vote for the next decade. According to Hummel et al., after 20 years the first country, by cumulating de jure and de facto subsidies, will score 40 PFSI points, while the second country will score only 20. However, the cumulative amount of funding per vote accrued by political parties in the first country (€2) amounts to just one-tenth of that in the second (€20). That the stock approach is the most problematic operationalization and creates distortions is confirmed by Figures A1 and A2 in the online appendix. They reveal that it has the lowest correlation with public funding measures from other studies compared to the alternative operationalizations of public funding in Hummel et al. and those re-coded by us.
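The two-country scenario above can be reproduced with a few lines of arithmetic. The sketch below follows the simplification used in the example itself: it cumulates raw points (one for the de jure and one for the de facto statutory subsidy per year) and ignores the log transformation and depreciation applied in the actual index. Function names are hypothetical, and amounts are kept in euro cents to avoid floating-point rounding.

```python
def cumulative_score(years_with_subsidy, components_per_year=2):
    """Raw cumulative index: one point per coded component per year."""
    return years_with_subsidy * components_per_year

def cumulative_funding_cents(years_with_subsidy, cents_per_vote):
    """Total subsidy per vote accrued over the period, in euro cents."""
    return years_with_subsidy * cents_per_vote

# Country 1: EUR 0.10 per vote for all 20 years.
# Country 2: EUR 2.00 per vote, introduced 10 years later (10 years in total).
score_1, funds_1 = cumulative_score(20), cumulative_funding_cents(20, 10)
score_2, funds_2 = cumulative_score(10), cumulative_funding_cents(10, 200)

print(score_1, funds_1 / 100)  # index points, euros per vote
print(score_2, funds_2 / 100)
```

The country with twice the index score has received one-tenth of the funding per vote, which is the distortion the stock approach introduces.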
Of course, there are other factors affecting the supply and demand of political funds (e.g. donations, spending limits) that might interact with the supply and demand of subsidies and, consequently, impact their effectiveness in discouraging corruption. However, Hummel et al. focus exclusively on public funding without considering other elements of the political financing regime and, as the example above shows, the stock approach may generate significant distortions.
Data accuracy
Besides operationalization issues, data accuracy is another problem. Despite Hummel et al.’s reliance on diverse sources, expert input and a detailed description of the data sources in the online appendix, the coding inaccuracies are rather concerning. While we acknowledge that data collection on state subsidies is difficult, there are multiple miscoded cases even in relatively well-documented countries. We provide here a few examples from the post-communist space because of (1) the availability of relatively high-quality subsidy data and (2) the way these cases affect Hummel et al.’s findings in terms of statistical significance and effect size. Accordingly, we scrutinized the coding of 27 post-communist polities in the Hummel et al. dataset for both party organizational and election-related financing and matched it with actual DPF data on party statutory and campaign financing from Lipcean (2021) for the 1990–2020 period. This comparison revealed extensive miscoding for various periods in approximately two-thirds of these countries. Figure 2 illustrates the miscoding across all post-communist regimes by displaying the relationship between the actual DPF per vote and the coding of de facto statutory and electoral subsidies in the Hummel et al. dataset. All cases that are coded by Hummel et al. as ‘No’ in both panels but lie above zero on the y-axis are miscoded country-year observations. This mismatch raises concerns about data accuracy, particularly when addressing a research question so critical for democratic governance.
Figure 2. Relationship between de facto amount of party organizational and electoral subsidies per vote and their coding in Hummel et al. data.
Replication results
The shortcomings identified above raise legitimate concerns over the PFSI’s validity. These concerns, in turn, call into question the reliability of Hummel et al.’s findings. Although Hummel et al. provide additional robustness tests using alternative operationalizations of PFSI in the Online Appendix, they nevertheless select the least defensible one for the analyses presented in the article. While a full replication of results based on our critique of PFSI lies outside the scope of this research, we conduct several analyses to assess the sensitivity of the results by considering some of the objections against PFSI.
Accordingly, we replicate the results using the original PFSI from Hummel et al.’s article and PFSI operationalized as an aggregate additive index. Additionally, we employ two alternative PFSI operationalizations: (1) PFSI as an additive measure but removing the duplicated de jure party and campaign funding components; (2) the same PFSI as above but with re-coded data on the de facto party organizational and campaign subsidies for the post-communist regimes. We fit the same model specification as Hummel et al. (Model 3 in Table 3): OLS with country and year fixed effects and cluster-robust standard errors (Hummel et al., 2021: 882). When using the same model specifications, our results are identical to those of Hummel et al., including the models with an additive version of PFSI presented as robustness tests in their appendix. Additionally, we analyse two samples: the full sample (1900–2015) and a restricted sample (1960–2015), following Hummel et al.’s model specification from Table C13 in their online appendix. The restricted sample aims to capture the differences in the effect of PFSI after more countries introduced subsidies compared to the 1900–1960 period. Hence, it provides a more robust test of the relationship between PFSI and corruption.
Furthermore, we extend the analyses by exploring the PFSI-corruption relationship on regionally split data. Given the temporal coverage of Hummel et al. study, this is a critical issue since the introduction of DPF might vary conditional on political development. Additionally, policy emulation and diffusion might be stronger within the same region than across them, which may affect the strength and direction of the PFSI-corruption relationship. Finally, we decompose the two-way fixed-effects model into separate unit/country and time/year fixed-effects analyses and compare the PFSI estimates to those from the two-way fixed-effects model. This approach allows us to address recent critiques of two-way fixed-effects models, which argue that they ‘unhelpfully combine within-unit and cross-sectional variation in a way that produces un-interpretable answers’ (Kropko and Kubinec, 2020: 1). Moreover, our decomposition and regional analyses help mitigate potential biases associated with two-way fixed-effects estimators discussed in econometric literature. First, the two-way fixed-effects estimator can produce biased estimates when treatment effects vary over time or across units (De Chaisemartin and D’Haultfœuille, 2020), which raises a justified concern regarding the heterogeneity of PFSI’s impact on corruption across different political and institutional settings. Second, in settings with staggered treatment adoption – as is the case with the implementation of public funding across different countries – two-way fixed-effects can assign negative weights to some treatment effects, potentially leading to biased estimates (Goodman-Bacon, 2021). This is particularly problematic since the adoption of public funding occurred at different times across countries. Third, two-way fixed-effects models often rely on what has been labelled as ‘forbidden comparisons’, where already-treated units are implicitly used as controls for newly treated units (Borusyak et al., 2021). 
This might bias the estimates since countries that adopted public funding earlier are used as controls for later adopters, potentially conflating the effects of subsidies with other time-varying factors. Fourth, if the treatment effect varies over time, two-way fixed-effects may not capture these dynamics accurately, potentially leading to biased estimates (Sun and Abraham, 2021). By examining regional variations and separating unit- and time-fixed effects, one may better account for treatment effect dynamics. Finally, this approach helps to reduce aggregation bias, a common issue when effects are heterogeneous across units or periods (Callaway and Sant’Anna, 2021). Accordingly, by conducting regionally split analyses, we can identify whether the PFSI-corruption relationship varies regionally, potentially revealing important contextual factors that influence this relationship. This disaggregated approach helps to address the critique that two-way fixed-effects models may produce ‘un-interpretable answers’ by conflating within-unit and cross-sectional variation (Kropko and Kubinec, 2020).
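The decomposition described above can be illustrated on a toy panel. The sketch below is not the authors’ estimation code: it uses a single regressor, an invented balanced panel of three countries over four years with staggered adoption, and the within transformation (group demeaning) to obtain country, year, and two-way fixed-effects slopes. All names and numbers are hypothetical and chosen so that the three estimators disagree.

```python
from collections import defaultdict

def demean(values, groups):
    """Subtract from each value the mean of its group (within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

def slope(x, y):
    """OLS slope of y on x for variables that are already demeaned."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Toy balanced panel (hypothetical): staggered adoption of a subsidy indicator.
subsidy = {"A": [0, 1, 1, 1], "B": [0, 0, 1, 1], "C": [0, 0, 0, 0]}
country_level = {"A": 3.0, "B": 2.0, "C": 1.0}  # time-invariant corruption level
year_shock = [0.0, 0.5, 1.0, 1.5]               # common upward trend

country, year, x, y = [], [], [], []
for c, path in subsidy.items():
    for t, x_ct in enumerate(path):
        country.append(c)
        year.append(t)
        x.append(float(x_ct))
        # Built-in effect of the subsidy, net of country and year effects: -1.
        y.append(country_level[c] + year_shock[t] - 1.0 * x_ct)

# Country fixed effects: demean by country only.
b_country = slope(demean(x, country), demean(y, country))
# Year fixed effects: demean by year only.
b_year = slope(demean(x, year), demean(y, year))
# Two-way fixed effects: in a balanced panel, demeaning by country and then
# by year removes both sets of effects.
b_twoway = slope(demean(demean(x, country), year),
                 demean(demean(y, country), year))

print(round(b_country, 3), round(b_year, 3), round(b_twoway, 3))
```

In this constructed example the two-way estimate recovers the built-in effect of -1, the country fixed-effects estimate is approximately zero because the common upward trend offsets it, and the year fixed-effects estimate is positive (0.5) because adopters start from higher corruption levels. The point is only that the three specifications answer different questions and need not agree, not that the real data behave this way.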
Hence, we investigate whether the results using alternative operationalizations of PFSI and different model specifications diverge from those in the Hummel et al. study. Although we fundamentally disagree with their approach to operationalizing public funding given its shortcomings, this strategy allows us to assess the robustness of their findings using regionally split data and country and year fixed-effects models. Moreover, our re-coded data for post-communist polities is especially helpful in assessing the sensitivity of results. Yet any effect of this re-coding will be visible only in the global and post-communist data samples.
Figures 3 and 4 present the results for the PFSI estimates based on models including the full set of covariates presented in the online supplementary materials (Figure 3: Tables B1-B8; Figure 4: Tables C1-C8). Along with the ‘PFSI Stock’, which is the main explanatory variable in the Hummel et al. study (Model 3, Table 3), and the ‘PFSI additive’ used for robustness in their online appendix (Table C6), we use two other measures: the ‘PFSI additive: No duplicates’ and the ‘PFSI additive: No duplicates & re-coded’, which are restricted versions of PFSI after removing the duplicates and re-coding post-communist regimes, respectively. The last two measures, although still suboptimal, are more valid indicators of public funding than those used by Hummel et al. As the results for the ‘Global’ sample in Panel A of Figure 3 reveal, alternative operationalizations of PFSI affect the magnitude of point estimates relative to the benchmark, the PFSI stock. However, Z-tests indicate no statistically significant difference between PFSI coefficients from alternative model specifications (Table A3, Online Appendix).
Figure 3. Two-way fixed-effects on the relationship between PFSI and corruption globally and by region.
Figure 4. Two-way, country and year fixed-effects on the relationship between PFSI and corruption globally and by region.
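A Z-test for the difference between two regression coefficients, of the kind mentioned above, is often computed as the difference in estimates divided by the square root of the sum of their squared standard errors. The sketch below assumes this common form and treats the two estimates as independent; whether the tests reported in Table A3 use exactly this formula is an assumption. The coefficients and standard errors are hypothetical.

```python
from math import erfc, sqrt

def coefficient_z_test(b1, se1, b2, se2):
    """Z statistic and two-sided p-value for H0: b1 == b2,
    treating the two estimates as independent (an assumption)."""
    z = (b1 - b2) / sqrt(se1 ** 2 + se2 ** 2)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return z, p_value

# Hypothetical coefficients and standard errors, for illustration only.
z, p = coefficient_z_test(-0.50, 0.20, -0.30, 0.20)
print(round(z, 2), round(p, 2))
```

With standard errors of this size, a sizeable gap between point estimates is still not statistically distinguishable from zero, which is why differences in magnitude across operationalizations need not translate into significant differences.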

Nevertheless, the results from Panel B are more conservative for the global sample, given that two PFSI estimates (‘PFSI additive’ and ‘PFSI additive: No duplicates’) reach statistical significance only at the 0.1 level. Since more countries introduced public funding of parties in the post-1960 period, these findings indicate that PFSI is a weaker deterrent of corruption in the restricted sample than in the full sample. The fact that the results are ‘back to normal’ in terms of statistical significance once we re-code the post-communist regimes only underlines our concerns regarding PFSI validity and data accuracy. Crucially, the results from regional samples show that the robustness of findings for the global sample is driven by two regions: Middle East and North Africa (MENA) and, to a lesser extent, Eastern Europe and Central Asia, where the negative relationship is statistically robust at the 0.05 significance level only for the stock-based PFSI, the primary target of our critique. More importantly, regardless of the PFSI version used, there is no evidence that PFSI is negatively associated with corruption in the other four regions. The results are similar for both the full (Panel A) and restricted (Panel B) samples.
The results from Figure 4 cast even more doubt on the PFSI’s negative effect on corruption, given the divergence between the results from two-way fixed-effects models and those from country and year fixed-effects models. The estimates from the country fixed-effects models hover around zero and are not statistically robust, those from year fixed-effects models are positive and robust when using the ‘PFSI stock’, and those from two-way fixed-effects models are negative. This pattern holds for both the full and restricted samples. When examining regional data in the upper panel, considerable variation emerges, but it is mostly related to time fixed-effects models. In contrast, none of the country fixed-effects models, usually considered better suited to addressing omitted variable bias than time fixed-effects alternatives, is robust at the 0.05 level.
The results from the bottom panel, which uses the re-coded PFSI, show a similar pattern with slightly improved results for the time fixed-effects models, mainly driven by the re-coding of post-communist regimes. Nevertheless, they still display significant regional variability with even more uncertainty around the estimates. Once again, country fixed-effects models, which capture the association between within-country change in PFSI over time and corruption, display no effect, except in the Middle East and North Africa. Although time fixed-effects estimates are larger and more robust for the Middle East & North Africa, Eastern Europe & Central Asia, and Sub-Saharan Africa, they derive from the cross-sectional variation in PFSI, which makes them more susceptible to the standard critique that cross-sectional data do not allow drawing robust inferences. Interestingly, the findings from Latin America show no support for the PFSI-corruption relationship. These findings are supported by other research focusing on Latin America that employed more general regulatory-based measures of political financing (Lopez et al., 2017), and contrast starkly with the case study of Paraguay, selected by Hummel et al. to showcase the underlying mechanism of how state funding deters party corruption (Hummel et al., 2021: 875–880).
Overall, these results point to an inconsistent relationship between PFSI and corruption globally, one that is sensitive to model specification and PFSI operationalization. Despite considerable regional variation and a few exceptions, findings from regional samples also point to an inconclusive relationship between PFSI and corruption. Crucially, the discrepancies observed between the two-way fixed-effects estimates and those from country and year fixed-effects models appear to substantiate the concerns raised by the econometric literature. The global negative two-way fixed-effects estimate contrasts sharply with diverse and often insignificant regional estimates, highlighting potential issues of heterogeneous treatment effects (de Chaisemartin and D’Haultfœuille, 2020) and aggregation bias (Callaway and Sant’Anna, 2021). The mixed (positive and negative) regional results suggest that the negative weighting problem identified by Goodman-Bacon (2021) may be at play, especially considering the temporal variation in public funding adoption. Additionally, the inconsistencies between global and regional estimates could stem from ‘forbidden comparisons’ (Borusyak et al., 2021), where already-treated units (those that have adopted public funding) serve as controls. These observations collectively underscore the complexity of the PFSI-corruption relationship and the limitations of relying solely on two-way fixed-effects estimates for policy inferences in complex and heterogeneous political and institutional contexts.
For a better understanding of how the change in PFSI is associated with corruption, we provide in the online Appendix E (Figures E1-E6) marginal effects plots showing how the increase of PFSI is associated with high uncertainty around the slope line. This uncertainty can already be expected if one explores the distribution of corruption scores across PFSI in raw data as shown in the online Appendix F (Figures F1-F2). These figures indicate the absence of a clear negative relationship between PFSI and corruption.
Our results, especially those based on regional samples, alternative operationalizations of PFSI and alternative model specifications, depict a considerably more pessimistic picture of the effect of DPF in restraining corruption when PFSI is used as the key explanatory variable. While our reanalysis casts doubt on the reliability of Hummel et al.’s findings, we also aimed to draw attention to the flawed approach to measuring public funding epitomized by the PFSI score. The cumulative effect of the identified shortcomings in Hummel et al.’s study outweighs the authors’ effort to provide a systematic analysis with extensive temporal and cross-national coverage of the relationship between public funding of parties and corruption.
Conclusion
This study has critically examined Hummel et al.’s (2021) influential analysis published in the British Journal of Political Science on the relationship between political finance reforms and corruption. While their ambitious cross-national study spanning 175 countries over 115 years represents a landmark contribution to our understanding of political financing reforms, our replication reveals several important methodological considerations that warrant attention.
Our analysis identified five key issues with the PFSI: redundancy in subsidy indicators, potentially problematic coding of campaign subsidies, challenges in majority/minority public funding categorization, concerns about the stock variable approach, and data accuracy issues particularly evident in post-communist countries. These methodological concerns prompted us to conduct alternative analyses using modified PFSI operationalizations and different model specifications.
The results from our replication paint a more nuanced picture than the original study. While we fully reproduced Hummel et al.’s findings using their exact model specifications, alternative approaches revealed considerable regional heterogeneity and sensitivity to model choice. Notably, the relationship between PFSI and corruption appears robust only in the MENA region and, to a lesser extent, in post-communist countries. The divergence between country fixed-effects, year fixed-effects, and two-way fixed-effects models raises important questions about the consistency of the relationship between political finance reforms and corruption reduction. This divergence may reflect potential biases in two-way fixed-effects estimators recently identified in econometric literature, including issues with heterogeneous treatment effects, staggered adoption of reforms, and problematic comparisons between early and late adopters of public funding.
These findings suggest that the relationship between political finance reforms and corruption may be more complex and context-dependent than previously thought. While our analysis indicates that our data re-coding has a limited impact on the global results, it underscores regional heterogeneity and contrasting findings contingent on modelling approaches. Rather than invalidating the potential of public funding to combat corruption, our results indicate that its effectiveness likely depends on regional contexts and specific institutional arrangements that deserve further investigation. Future research would benefit from more refined measurements of public funding, careful consideration of regional variations, and robust empirical strategies that account for the methodological challenges identified in this study.
As democracies continue to grapple with political corruption, understanding the impact of finance reforms remains crucial. Our findings underscore the importance of approaching this relationship with methodological rigour while remaining open to the possibility that the effects of such reforms may vary significantly across different political and institutional contexts. This nuanced understanding is essential for policymakers considering political finance reforms as a tool for combating corruption.
Supplemental material
Supplemental Material - Do political finance reforms really reduce corruption? A replication study
Supplemental Material for Do political finance reforms really reduce corruption? A replication study by Sergiu Lipcean and Fernando Casal Bértoa in Research & Politics.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Carnegie Corporation of New York Grant
This publication was made possible (in part) by a grant from the Carnegie Corporation of New York. The statements made and views expressed are solely the responsibility of the author.
