Abstract
The availability of datasets scraped from the European Union's websites has greatly advanced the systematic analysis of European integration. But despite their enormous value, European Union databases contain almost no information about policymaking during Europe's first two decades, and for later periods, they suffer from far more inconsistencies and errors than has been previously recognised. This article draws upon extensive archival research and manual coding to identify and correct several of these limitations. I present a new dataset (EUPROPS) containing information on proposals for European Union policy from 1958 to 2021 and their outcomes. To illustrate the value of the dataset, I present some surprising initial findings about patterns of policymaking across this 60-year period and identify avenues for future research.
With the increasing use of systematic data on European Union (EU) policy proposals and adopted laws, the study of European integration has taken great strides towards becoming a ‘normal science’. Scholars have exploited large-n datasets to examine a wide range of questions, including the volume and pace of policymaking, strategic agenda-setting, the effects of EU enlargement, the division of labour within the Council, the extent of delegated authority, EU accountability, coalition building, the relevance of package deals, the selection of European Parliament rapporteurs, potential trade-offs between efficiency and legitimacy, the (ir)relevance of formal and informal rules and procedures, the choice of different legal instruments and the impact of institutional reform (Kaeding, 2004; Golub, 2007; König, 2007; Häge, 2008; Hertz and Leuffen, 2011; Rasmussen and Toshkov, 2011; Kardasheva, 2013; Klüver and Sagarzazou, 2013; Crombez and Hix, 2015; Toshkov, 2017; Hurka and Steinebach, 2021; Rauh, 2021).
To construct their datasets, scholars often scrape, or otherwise extract, information from databases run by the EU. These datasets focus primarily on three main types of proposals: proposed Council directives (which I shall refer to as pdirs), proposed Council regulations (pregs) and proposed Council decisions (pdecs). All three originate most frequently from the European Commission, although some pdirs/pregs/pdecs are proposed to the Council by other actors such as the European Central Bank or individual member states, and all three specify varying degrees of involvement by the European Parliament. Simplifying slightly, originally, there were two main databases. Prelex was run by the Commission to catalogue each proposal's procedural steps in the legislative process, and CELEX was run by the EU Publications Office to provide an automated legal documentation system based on information drawn mainly from the EU's Official Journal. These were eventually merged into the single online database EUR-Lex, with the CELEX data for a given proposal available via the document information link and most of the Prelex data accessible via the procedure link (Düro, 2009; Blom-Hansen, 2019).
In this article, I identify a variety of problems with the main EU databases and the leading datasets extracted from them and then take steps to correct them. I demonstrate that the well-known limitations of EU databases are only the tip of the iceberg. Not only do we still know almost nothing about proposals made before 1975, but for the post-1975 period, both databases and the datasets that have been scraped from them suffer from far more missingness and errors than has been previously recognised. By combining extensive archival research with information obtained from EU databases and linked documents, I am able to present a new and more accurate dataset (EUPROPS) containing information on proposals for EU policy from 1958 to 2021 and their outcomes.
I highlight four main problems I discovered in Prelex and EUR-Lex: over- or underinclusion of proposals and missing or erroneous values for key independent variables. Then, I discuss the strategy I employ to rectify the shortcomings of Prelex and EUR-Lex, which involved a gruelling forensic process whereby I manually examined 23 years of documents from the Commission and Council archives, as well as EUR-Lex's main pages and procedure pages for 1958 to 2021. I describe how I identify and code individual proposals and track them to their eventual outcome. Subsequently, I present the fruits of this labour: EUPROPS, a new dataset containing information on proposals for EU policy from 1958 to 2021. EUPROPS contains 29,964 proposals (2938 pdirs, 16,241 pregs and 10,785 pdecs), 4188 of which were in neither Prelex nor EUR-Lex. EUPROPS is freely available on the journal website and on Harvard's Dataverse. My preliminary analysis of the data casts the history of EU policymaking in a new light, challenging conventional accounts of relative paralysis in the 1960s and 1970s. The first 25 years appear to be the heyday for EU policymaking, after which it consisted of doing less and less each year and doing it ever more slowly. In the concluding section, I identify some of the practical advantages the new dataset provides for those who track EU policymaking and discuss how it might be exploited to advance further research in a number of areas.
Problems with web-scraped EU policy data
Some of the limitations of EU databases are well known. As of 2006, CELEX and Prelex were both incomplete prior to 1984, and even for the post-1984 period, both were unreliable due to missing data for the legal basis of, and procedure applicable to, many proposals (König et al., 2006). As of 2011, Prelex contained almost no information about proposals made prior to 1975; it suffered from frequent inconsistent usage of descriptor values and often erroneously assigned to proposals procedures that did not exist when the proposals were first made (Häge, 2011). Blom-Hansen noted in 2019 that although decisions may be substantively important, EUR-Lex only includes a small minority of them (2019:702). As of 2021, even after years of updating, EUR-Lex's list of proposals is, according to Rauh (2021), still only complete for 1994 onwards, and the information about each proposal is often missing or erroneous: missing document identifiers, titles, legal bases and dates (Ovadek, 2021:9-12), documents wrongly classified as original when the title indicates they are amended proposals and adopted laws missing from the procedure pages in the most recent years due to archiving delays (Rauh, 2021:12).
To identify the shortcomings of EU databases, I began by systematically comparing data from the three leading datasets (Häge, 2011; Hertz and Leuffen, 2011; Rauh, 2021), the leading software package designed to scrape EUR-Lex (Ovadek, 2021) and EUR-Lex's built-in search and export feature. 1 Since the merger, the Prelex database no longer exists, but the well-known EUPOL dataset compiled and disseminated by Häge (2011) contains a complete scrape of Prelex as of 2011. For their EULO dataset, Hertz and Leuffen (2011) also scraped proposals from Prelex and then added what is effectively CELEX information scraped from EUR-Lex. More recently, Rauh (2021) scraped a list of proposals from EUR-Lex and then added information extracted from the procedure links. Ovadek (2021) provides a eurlex package, written in R, designed to scrape a wide range of information from EUR-Lex, and it has both advantages and disadvantages compared to EUR-Lex's built-in search function whereby users can export the results from bulk searches. Below, I highlight the major problems I discovered in all these sources. Besides the footnotes, the Online appendix provides further examples of the types of errors I identify. Additional examples are available from the author upon request.
Prelex's errors are evident in the Häge (2011) and Hertz and Leuffen (2011) datasets. As far as I could determine, most of these errors still exist on the EUR-Lex procedure pages. 2 Both datasets are underinclusive because they contain almost none of the 4884 policy proposals made before 1975, they omit 3340 (i.e. nearly 17%) of COM proposals and either 102 or 987, respectively (i.e. 10% and 100%) of SEC proposals contained in 1975 to 2011 COM and SEC documents, they often treat proposals as non-proposals and they frequently conflate multiple proposals into a single row of data. At the same time, a second systematic problem is that both datasets are overinclusive. They often tag as original pdirs/pregs/pdecs texts that contain no policy proposals (e.g. communications, reports, recommendations, resolutions, opinions, letters from the budget authority and budget statements), as well as texts that are amended proposals. Whereas amended proposals (e.g. COM(1984)0470) should not be included, the thousands of original pdirs/pregs/pdecs that amend existing policy – i.e. amending acts – should be included (e.g. COM(1985)0762). Overinclusion also occurs because they inadvertently classify as pdir/preg/pdec numerous proposals for Commission rather than Council policy – instruments with titles such as ‘Proposal for a Commission Regulation’ or ‘Draft Commission Decision’ – and because they wrongly treat single proposals as multiple. 3 Both datasets have thousands of cases where values for the title, proposal date, legal basis, end date, outcome and adopted law(s) are either missing or do not match the information on EUR-Lex, and only some of these problems are attributable to proposals being pending and thus right-censored. I also discovered that Prelex is internally inconsistent regarding the proposed legal basis: the two relevant fields, which both contain the ‘original’ legal basis, often differ.
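The distinction drawn above between amended proposals (to be excluded), amending acts (to be included) and proposals for Commission rather than Council policy (to be excluded) can be illustrated with a title-based screen. The following is a minimal sketch under stated assumptions, not the coding procedure used for EUPROPS: the function name, the category labels and the regular expression are hypothetical, the example titles are invented, and real titles (often in French, and often labelled ‘draft’ or ‘project’) would require manual checking.

```python
import re

# Hypothetical pattern: "Proposal for a ..." or "Draft ...", optionally naming
# the Commission or the Council, followed by the instrument type.
_TITLE = re.compile(
    r"(proposal for|draft)\s+(?:an?\s+)?(commission\s+)?(council\s+)?"
    r"(regulation|directive|decision)\b"
)
_CODES = {"regulation": "preg", "directive": "pdir", "decision": "pdec"}


def classify_title(title: str) -> str:
    """Assign an English-language title to an illustrative category."""
    t = title.strip().lower()
    # An amended proposal revises an earlier proposal and is not a new one.
    if t.startswith(("amended proposal", "modified proposal")):
        return "amended_proposal"
    m = _TITLE.match(t)
    if m is None:
        # Communications, reports, opinions and similar texts.
        return "no_proposal"
    if m.group(2):
        # Proposal for Commission rather than Council policy.
        return "commission_policy"
    # Includes amending acts, e.g. "... Regulation amending Regulation ...".
    return _CODES[m.group(4)]
```

A title such as ‘Proposal for a Council Regulation amending Regulation …’ is kept as a preg because it is an original proposal that amends existing policy, whereas a title beginning ‘Amended proposal for …’ is screened out.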
EUR-Lex's errors are evident in the scrapes using the eurlex package and the direct search and export function. 4 EUR-Lex contains documents for the period from 1958 to 2023, so its temporal coverage is much longer than that of Prelex, but it, too, suffers from considerable under- and overinclusion. For a start, it lacks nearly 1600 (i.e. over 9%) of the COM proposals found on Prelex. The majority of these missing proposals either originate from before 1984 or are for pdecs, but quite a few more recent pregs and pdirs are also missing. It also lacks over 500 (nearly 63%) of the policy proposals contained in SEC documents on Prelex. EUR-Lex also frequently treats multiple proposals as one, especially prior to 1984. On the other hand, EUR-Lex wrongly lists as pdirs/pregs/pdecs numerous texts which contain no proposals, modified proposals or proposals for Commission rather than Council policy, and it sometimes wrongly treats single proposals as multiple.
A severe shortcoming of both EUR-Lex scraping methods is that several key variables are entirely or partially missing or they often contain information that does not match Prelex. Neither Ovadek's (2021) R package nor EUR-Lex's search and export function grabs the actual outcome of the legislative process from the document information page (the adopted law(s) and whether the proposal was withdrawn or replaced), just the final date. Moreover, the final date and legal basis are frequently missing, systematically for proposals made before 1984 but often also for more recent proposals. End dates are consistently missing for proposed Council policy that is eventually adopted as Commission policy. For both EUR-Lex scraping methods, in thousands of cases, the harvested legal basis for each proposal is incomplete because it lacks subsections of treaty articles or prior legislation. Also, neither method extracts any of the Prelex fields from the procedure links. Presumably, both the eurlex package and the EUR-Lex export function could be improved to grab more of this currently unscraped information, much of which is visible in EUR-Lex's ‘date of end of validity’ and ‘legal basis’ fields on the document information page when one manually searches for individual proposals. However, I also discovered thousands of cases where the scraped or unscraped information on the EUR-Lex document information pages did not match Prelex for the title, proposal date, legal basis, end date, outcome or adopted law(s).
Finally, Rauh (2021) deliberately chose a shorter temporal period and only scraped EUR-Lex for Commission proposals made from 1985 to 2016. But even for this time period, his dataset is more underinclusive than the other two EUR-Lex scrapes. It lacks the same COM proposals as they do, as well as hundreds of others, and all the proposals found in SEC, JAI, HB and JC documents. Although Rauh effectively limits overinclusion by almost entirely excluding amended proposals and texts that contain no proposals, his dataset still inadvertently contains some proposals for Commission pdecs/pregs/pdirs, as well as two dozen duplicate cases that code provisional data. In terms of missingness, Rauh's dataset lacks several of the variables that can be scraped by the eurlex R package and the EUR-Lex direct search function and many of the former Prelex variables available on the procedure pages. However, it does have a clear advantage in that it contains two procedure variables where available: the initial legal basis and the last adopted law for each proposal, including Joint Committee and Association Council legislation (e.g. COM(2000)0454, COM(2000)0253). While Rauh's approach of combining EUR-Lex and procedure information makes sense, it does not resolve the underlying errors and inconsistencies in the two sources.
Overall, my investigation reveals that EU databases and the datasets scraped from them contain almost no information about legislative policymaking before 1975 and that they suffer from a remarkable number of inconsistencies and errors. For thousands of proposals, Prelex and EUR-Lex differ on the title, instrument type, proposal date, legal basis, outcome, adoption date(s) and adopted law(s). In all of these cases, at least one of the databases must be wrong, and in some cases, it turns out that both are wrong. Of course, all errors on EU databases might not be equally serious or equally relevant for all users; it will depend on how they intend to exploit the data. But the fact that in many cases Prelex and EUR-Lex do record complete and accurate information – for example, the precise treaty article(s) and portions of secondary legislation that form the legal basis for a proposal or all of the laws adopted from a proposal – suggests that all the errors are potentially important.
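The kind of field-by-field comparison described above can be sketched as follows. This is an illustration only, assuming each source has already been reduced to one record per proposal; the field names, the `diff_fields` helper and the normalisation rules are hypothetical and are not taken from Prelex, EUR-Lex or EUPROPS.

```python
# Hypothetical field names for the variables on which the two sources differ.
FIELDS = ("title", "instrument_type", "proposal_date", "legal_basis",
          "outcome", "adoption_date", "adopted_laws")


def _is_missing(value) -> bool:
    # None, empty strings and empty collections all count as missing.
    return value is None or (hasattr(value, "__len__") and len(value) == 0)


def _norm(value):
    # Ignore case and whitespace in strings, and ordering in collections.
    if isinstance(value, str):
        return " ".join(value.split()).casefold()
    if isinstance(value, (list, tuple, set)):
        return tuple(sorted(_norm(v) for v in value))
    return value


def diff_fields(prelex_row: dict, eurlex_row: dict, fields=FIELDS) -> dict:
    """Map each disagreeing field to 'missing' or 'conflict'."""
    report = {}
    for name in fields:
        a, b = prelex_row.get(name), eurlex_row.get(name)
        if _is_missing(a) and _is_missing(b):
            continue  # nothing to compare in either source
        if _is_missing(a) or _is_missing(b):
            report[name] = "missing"
        elif _norm(a) != _norm(b):
            report[name] = "conflict"
    return report
```

A ‘conflict’ flag only shows that at least one source is wrong. Because both may be wrong, a flag of this kind identifies cases for archival checking rather than resolving them.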
Fixing the problems
To rectify the shortcomings in Prelex and EUR-Lex, I manually examine 23 years of documents from the Commission and Council archives, as well as EUR-Lex's main and procedure pages from 1958 to 2021. The research initially involved numerous trips to the archives in Brussels, but as the EU digitised more records, the Commission's, Council's and European University Institute's (EUI) historical archive websites became invaluable resources.
I first identify the final versions of all the pdirs/pregs/pdecs from over 100,000 Commission documents issued from 1958 until 1991 that could potentially include proposals and code each one's proposal date, title and legal basis. 5 In most cases, identifying the final version of each proposal is straightforward since the COM or SEC documents indicate ‘final’ and ‘presented to the Council’ on the first page. It is important to note that final versions are routinely titled ‘draft’ or ‘project’, particularly for pregs and pdecs. However, in some cases, the final version is not available in the Commission archives, only in the Council archives. In many cases, there simply is no version labelled ‘final’. I treat these cases as policy proposals when Council archives indicate that they were submitted to the Council; otherwise, I exclude them as pre-proposals. 6 Wherever available, I record the French title for proposals prior to the UK's entry in 1973 and then English titles thereafter. In some cases, I discovered that one language contained errors in the title, type or legal basis and I correct these by using information from other languages. 7
I then use information from Council archives, Commission archives, EUI archives and the Official Journal to track each proposal to its outcome(s). Archived records of Council negotiations provide the gold standard for tracing the fate of policy proposals, but as with all archival research, I had to devise workarounds when I encountered clerical errors, such as mislabelled folders, missing pages or misdirected web links. Tracking proposals is often difficult because the Council archive's database can only be searched in French. Even then, sometimes the search engine only locates files if searched for by a single word, or even a word fragment, rather than the proposal's title or document number. For the end date and the law(s) adopted from each proposal, I code the first adoption but also list all the laws adopted. In many cases, especially pdecs, a proposal is adopted with no legislation. For proposals that yield Association Council decisions or similar (sector two documents on EUR-Lex), I code the date of Council adoption, and in an improvement over EUR-Lex, I also list the Association Council decision itself, which is often finalised much later. If the proposal is never adopted, I list the date it was replaced, withdrawn or rejected. I was unable to find adoption dates or legislation for some proposals related to the opening of international negotiations that we know were later concluded, so the proposals must have been adopted at some point but the relevant details were not recorded. For proposals where the Council position is not indicated or that are officially ‘not adopted by Council’, but then result in Commission legislation, I code the date of the Commission law. 8 Proposals that are officially not adopted (or where Council archives refer to non-adoption) and result in no legislation of any sort are treated as right-censored as of the last date I could find of Council activity.
Interestingly, archival records indicate that 73 proposals were adopted before they were officially proposed. For these cases, I adjust the outcome date and treat the proposals as if they were adopted on the same day they were proposed. For a handful of cases that were adopted, I was unable to trace the original Commission proposal, or the Council archives indicate it is missing. I treat these cases as left-censored as of the earliest date where the Council or European Parliament discussed them.
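The coding rules for end dates and censoring described above can be summarised as a small decision function. This is a sketch under stated assumptions rather than the code used to build EUPROPS: the `Proposal` fields, the status labels and, in particular, the order of precedence among the rules are hypothetical. Left-censored cases, where the original proposal could not be traced, are not modelled.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Proposal:
    proposed: date
    # Dates of all laws adopted by the Council from this proposal.
    council_adoptions: list = field(default_factory=list)
    # Date of Commission legislation when the Council itself did not adopt.
    commission_law: Optional[date] = None
    # Date the proposal was replaced, withdrawn or rejected.
    terminated: Optional[date] = None
    # Last recorded date of Council activity on the proposal.
    last_council_activity: Optional[date] = None


def code_outcome(p: Proposal):
    """Return (end_date, status) for one proposal."""
    if p.council_adoptions:
        # Code the first adoption; if it predates the official proposal,
        # treat the proposal as adopted on the day it was proposed.
        return max(min(p.council_adoptions), p.proposed), "adopted"
    if p.commission_law is not None:
        return p.commission_law, "adopted_as_commission_law"
    if p.terminated is not None:
        return p.terminated, "not_adopted"
    # No outcome of any sort: right-censored at the last Council activity.
    return p.last_council_activity, "right_censored"
```

For example, a proposal dated 1 March 1970 whose earliest recorded adoption is 20 February 1970 would be coded as adopted on 1 March 1970.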
Finally, I also identify 367 cases which I flag with an ‘only Council’ indicator because, according to the Council archives, the original policy proposal was formally made by an individual member state, the Council Secretariat or the European Court of Justice, with no Commission document. 9 Originating many years before member states could propose policies via the ‘initiative’ instrument, they deal mainly with the salaries and pensions of community officials or with changes to the common customs tariffs. I treat them as uncensored, and for the proposal date I record the earliest date that the Council discussed them. My dataset is the first to include such ‘only Council’ proposals, but as noted below, I did not capture them all.
Archival records only become available after 30 years, so to identify and track proposals and outcomes beyond 1991, I rely on the Official Journal, materials kindly provided by the EU Publications Office and documents accessible via EUR-Lex's main and procedure pages. I then add in proposals from Häge (2011) that I could not find in the archives or on EUR-Lex. Finally, wherever available, I assign each proposal its unique ‘webno’ identifier from Häge (2011) and its unique ‘cellar reference’ from EUR-Lex. This enables users of my new dataset to merge in additional variables from Häge's (2011) scrape of Prelex and from their own scrapes of EUR-Lex. Having both identifiers also helps distinguish between proposals made during the period from 1958 to 2011 since Prelex and EUR-Lex often attach different document numbers to the same proposal, especially for multi-proposals.
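Merging additional variables through these identifiers might look like the following minimal sketch. It assumes the data are held as lists of dictionaries and that the identifier column is named `webno`; the actual column names in EUPROPS and in Häge's (2011) dataset may differ, and the `merge_extra` helper is hypothetical.

```python
def merge_extra(euprops_rows, extra_rows, key):
    """Left-join extra variables onto EUPROPS rows using a shared identifier.

    Rows without an identifier are kept unchanged, and where both sources
    hold the same variable the EUPROPS value is retained.
    """
    index = {row[key]: row for row in extra_rows if row.get(key) is not None}
    merged = []
    for row in euprops_rows:
        extra = index.get(row.get(key), {})
        merged.append({**extra, **row})  # later keys (EUPROPS) take precedence
    return merged
```

The same function could be called a second time with the cellar reference as the key to add variables from a fresh scrape of EUR-Lex.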
How complete is EUPROPS? One important metric is whether I managed to track all the proposals to a definite outcome, whether that be adoption, replacement, rejection or withdrawal. Across the entire period from 1958 to 2021, I was able to identify the outcome for 96.7% of the 29,964 proposals. Nearly a third of all the untraced proposals in my dataset originated in 2017 or more recently, so many are likely still under discussion in the Council.
A second important metric is whether I traced all the laws adopted from 1958 until 1986, the period where Prelex, EUR-Lex and previous datasets are most sparse. Based on a list compiled from EUR-Lex's ‘directory of legal acts’ (a subset of all sector three documents), EUPROPS accounts for all of the adopted Council directives and all but 10 of the adopted Council regulations. However, it is somewhat incomplete for adopted Council decisions. I believe I have traced all Council decisions that stem from Commission proposals, but I have not accounted for many Council decisions that appear to be ‘only Council’ actions, having originated with no Commission proposal or recommendation, just a member state request. These normally comprise only a handful each year and relate mostly to the common commercial policy, remuneration of community officials, conclusion of conventions and association agreements, implementation of the EU budget, the extension of state aid or nominating members of various advisory committees (e.g. 31962D1219, 31964D0345, 31967D0146, 31977D0342, 31978D0385, 31980D0025, 31982D0134, 31983D0520 and 31986D0014).
Besides the issue of untraced ‘only Council’ decisions, EUPROPS has several other important limitations. First, it does not contain information about policy proposals and outcomes that is only available in archival records since 1992. However, the dataset will be updated as more archival records become available. Second, although Commission policy may be substantively important, and might have increased in scale since the 2009 Lisbon Treaty (see Blom-Hansen, 2019:702; Williams and Bevan, 2019), EUPROPS retains the focus on Council policy. It only incidentally contains information about Commission decisions, Commission regulations or Commission directives if they originate from a proposal for a Council policy. This excludes some types of tertiary acts – ‘acts based on provisions in secondary acts’ (Blom-Hansen, 2019:700) – such as all comitology activity, all Commission delegated decisions/regulations/directives and all Commission implementing decisions/regulations/directives. Nonetheless, EUPROPS does include the thousands of acts that arise from proposals for Council policy where the sole legal basis for the proposal is a provision in a secondary act.
My decision to exclude the bulk of Commission policy was taken on intellectual and practical grounds. Intellectually, nearly all the previous large-n analyses I am aware of have focused on Council policy. This not only includes ones like those listed at the start of this article that scrape EU databases but also those like the EU Decides project and its offshoots that use EU databases to capture the titles, text, procedures and outcomes for a small subset of all proposals and adopted laws (Thomson et al., 2006; Thomson, 2011; Aksoy, 2012; Arregui and Perarnaud, 2021; Golub, 2022), as well as studies that examine opt-outs from EU legislation (Duttle et al., 2017; Princen et al., 2022; Zbiral et al., 2022). Even large-n studies of delegation focus on the content of Council policy (Franchino, 2007; Brandsma and Blom-Hansen, 2017). Practically, it may not even be possible to track most Commission policy from proposal to adoption or withdrawal. Neither the Commission nor Council archives that are publicly available appear to contain information about comitology proposals and other proposals for delegated acts. EUR-Lex holds no information on comitology proposals and almost no information on proposals for Commission delegated or Commission implementing acts.
Third, EUPROPS might undercount pdecs for the post-2011 period for which I rely on EUR-Lex to identify proposals. This is due to how the two EU databases were merged. Prelex contained many more proposals than CELEX for Association Council and governments of the member states’ decisions. Because it is difficult to systematically locate or search EUR-Lex procedure pages that are not accessed via a EUR-Lex proposal page, and because not all of Prelex was merged into EUR-Lex, I likely miss some post-2011 pdecs related to these sorts of policies. Fourth, I do not include what I would refer to as non-policy documents from the Commission, even if in some instances these were transformed by the Council into directives, regulations or decisions. Non-policy documents include such things as green or white papers, working documents, Commission opinions and proposals for recommendations, accords or programmes. Fifth, I only try to identify certain types of errors in EU databases and the datasets scraped from them. I have not, for example, attempted to verify information contained in EUR-Lex and Prelex about the procedure applicable to each proposal, the main policy area(s) it addresses, the Directorates-General involved in drafting the proposal or the legal basis of the adopted legislation. Finally, I have not attempted to code the extent and quality of integration generated by each proposal or how its substantive content changed when ultimately agreed on by the Council (and European Parliament).
The big picture: 60 years of EU policymaking
Conventional accounts of EU policymaking refer to the 1960s, 1970s and early 1980s almost as the dark ages and point to the 1987 Single European Act (SEA) as a key reform that relaunched integration. Such accounts maintain that the Luxembourg Compromise in January 1966 ‘had a phenomenal political impact upon Community decision-making’ (Baquero Cruz, 2006:273), ‘ushered in nearly two decades of de facto unanimous agreement in the Council’ (Hix and Hoyland, 2022:60) and thereby produced ‘a state of semi-paralysis that ensued and from which the European Communities were only partially released many years later’ (Palayret, 2006:46). By contrast, policymaking since 1987 supposedly exhibited ‘renewed dynamism’ (Phinnemore, 2022:18), such that by 2002, European integration was viewed as a ‘never ending success story’ (Schneider, 2002). Subsequent descriptions of EU policymaking were only somewhat less rosy. As of 2011, Häge found ‘no clear-cut trend’ (2011:467) – thus no decline – in the supply of proposals and that the proportion of proposals adopted had ‘increased somewhat in the long term’ (2011:470), although the amount of time it took the Council to reach agreements had increased substantially starting in the early to mid 1990s (2011:475). Other observers looking back on the period since 2004 suggest that the EU has ‘generally maintained a “business as usual” level of policy productivity’ (Pollack et al., 2020:480).
Preliminary analysis of EUPROPS casts the history of EU policymaking in a very different light. Overall, policymaking during the community's first 25 years, including the period after the Luxembourg Compromise, was not marked by Eurosclerosis or general paralysis. Quite the contrary: by several criteria, this period was its heyday. As shown in the top left panel of Figure 1, the number of proposals grew sharply each year, from 22 in 1960 to 773 by 1980. The number of proposals adopted each year also exploded, reaching 658 in 1976, the second highest in the EU's entire history (top right, Figure 1). The huge number of pdirs, pregs and pdecs processed contradicts the claim that in the decade following the Luxembourg Compromise the Council depended ‘mainly on non or extra-treaty devices such as “resolutions”’ (Bieber and Palmer, 1975:311).

Figure 1. Volume, adoption rate and pace of EU policymaking.
Also contrary to conventional accounts, the vast majority of proposals made during the 1960s, 1970s and early 1980s were eventually adopted, not permanently stuck in deadlocked Council negotiations or formally withdrawn by the Commission. Nearly all the proposals made before 1961 were adopted – which might just reflect the small numbers involved – and after that, the adoption rate remained around 90%. The sole exception was 1969 when it dropped to 83% (bottom left, Figure 1). What makes this dramatic expansion of policy all the more remarkable – and again, contrary to standard accounts about the debilitating effects of the Luxembourg Compromise – is that the typical duration of Council policymaking, measured by the median number of days from proposal to adoption, rose by only three weeks, from about 45 days in the early 1960s to about 70 days in the early 1980s (bottom right, Figure 1). Policymaking was faster during the period from 1961 to 1973 than at any subsequent point in EU history, with 1966 and 1967 having the lowest median duration of less than one month.
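The three yearly quantities discussed here (number of proposals, proportion adopted and median days from proposal to adoption) could be computed along the following lines. This is a minimal sketch, assuming each proposal is reduced to a pair of proposal date and adoption date, with `None` for proposals not adopted. Grouping by year of proposal and taking the median over adopted proposals only are assumptions, since the text does not specify how pending proposals enter the median.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def yearly_summary(rows):
    """Summarise (proposed, adopted_or_None) pairs by year of proposal."""
    by_year = defaultdict(list)
    for proposed, adopted in rows:
        by_year[proposed.year].append((proposed, adopted))

    summary = {}
    for year, items in sorted(by_year.items()):
        # Durations in days, for adopted proposals only (an assumption).
        durations = [(a - p).days for p, a in items if a is not None]
        summary[year] = {
            "proposals": len(items),
            "adoption_rate": len(durations) / len(items),
            "median_days": median(durations) if durations else None,
        }
    return summary
```

Because proposals from recent years may still be pending, adoption rates and median durations computed in this way for the latest years are subject to right-censoring.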
Compared to the pre-SEA period, which saw an explosive growth of proposals and adoptions, with only a marginal rise in median duration, policymaking during the period from 1987 to 2009 appears to have involved doing less and less each year and doing it ever more slowly. Not only did the median duration double from 70 days in 1987 to a maximum of 154 in 2008, but it did so while the number of yearly proposals collapsed from 752 in 1986 to 396 in 2009 and without an increase in the proportion of proposals adopted.
In the aggregate, EU policymaking since 2009 has arguably been even less productive. The number of yearly proposals bounces around but mostly continues downwards, hitting just 261 in 2019. The proportion of proposals adopted remained near 90% until 2015 but then dropped off, falling below 80% in 2017 and 2019. As of 18 July 2023, only 69% of the proposals made in 2021 had been adopted, a surprisingly low figure even allowing for right-censoring. Trends in the median duration are less clear. It fluctuates widely, dropping back below 80 days in 2009 and 2014, but reaching 183 days in 2016 and an all-time high (apart from the small-numbers case in 1960) of 218 days in 2018. The lower median durations for the two most recent years of data might reflect the large number of pending proposals but could also indicate a reversal in the long-term slowdown.
Dividing the data by policy instrument type provides a more nuanced view of these aggregate trends. As shown in Figure 2, the explosion in policy activity during the 1960s, 1970s and early 1980s involves all three types of instruments but especially regulations and, to a lesser extent, decisions. The number of proposed and adopted directives also increased but not nearly as much. Likewise, the collapse in proposals and adoptions after the 1987 SEA mainly affects regulations, and their number has continued to fall ever since. The number of proposed and adopted directives remains fairly constant until a steady decline since about 2009. Bucking both these trends, the number of proposed and adopted decisions continued to climb sharply until 2008. Their steep drop after that point might reflect the general reduction in policy activity since 2008 but might also indicate, as mentioned earlier, that my dataset probably undercounts post-2011 pdecs due to the merger of Prelex into EUR-Lex. Either way, since the early 1980s, there has been a clear shift towards pdecs as the preferred type of policy instrument.

Figure 2. Policy input and output by instrument type.
The data confirm that Council negotiations over directives are particularly protracted (see Golub, 2007; Hertz and Leuffen, 2011; Toshkov, 2017) but also suggest that, across the history of the EU, the key dynamics of policymaking have remained relatively constant for this particular instrument. As shown in the right-hand panels of Figure 3, pdirs typically require much longer periods of Council negotiations before they are adopted, roughly 600 days as opposed to 90 for pdecs and 70 for pregs. The adoption rate for pdirs was extremely low in 1969 (57.7%), 2010 (66.7%) and 2014 (69.2%) (also in 1961, but that was due to tiny numbers) but otherwise has remained around 85% or higher. Likewise, although pdirs initiated in 1965 and 1969 had by far the longest median durations ever (1286 and 1759 days, respectively), these were exceptional. Overall, policymaking speed for pdirs increased somewhat in the 1970s and early 1980s and then slowed down until about 2005 and has shown great variability since then.

Figure 3. Adoption rate and pace of EU policymaking by instrument type.
Patterns for the other two types of instruments are starker. The adoption rate for pregs, although highly variable, tended to increase until about 1994 and has declined steadily ever since. The adoption rate for pdecs, also noisy, remained at around 85% until a sharp fall-off starting in 2016. The median duration for pregs, which had risen slowly and steadily throughout EU history, has risen dramatically since 2010. The median duration for pdecs more than tripled from 1972 to 1987, peaked at 193 days in 1995 and then fell sharply, albeit unevenly, over the next 25 years.
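The two summary measures discussed here, the adoption rate and the median duration, can be sketched for any grouping of proposals by instrument type and proposal year. The following Python sketch is illustrative only: the records are invented toy data rather than EUPROPS observations, the field layout and the function name `summarise` are hypothetical, and the median is taken over adopted proposals only, which is an assumption rather than a description of how the figures in this article were produced.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Toy records invented for illustration (NOT EUPROPS data).
# Each tuple: (instrument type, proposal date, adoption date or None if never adopted).
PROPOSALS = [
    ("pdir", date(1969, 3, 1), date(1970, 10, 22)),
    ("pdir", date(1969, 6, 10), None),
    ("preg", date(1969, 2, 1), date(1969, 4, 12)),
    ("preg", date(1969, 5, 20), date(1969, 7, 29)),
    ("pdec", date(1969, 9, 1), date(1969, 11, 30)),
]


def summarise(proposals):
    """Return {(instrument, proposal year): (adoption rate, median days to adoption)}.

    The median is computed over adopted proposals only and is None
    when no proposal in the group was adopted.
    """
    groups = defaultdict(list)
    for instrument, proposed, adopted in proposals:
        groups[(instrument, proposed.year)].append((proposed, adopted))

    summary = {}
    for key, rows in groups.items():
        # Days between proposal and adoption, for adopted proposals only.
        durations = [(adopted - proposed).days
                     for proposed, adopted in rows if adopted is not None]
        rate = len(durations) / len(rows)
        summary[key] = (rate, median(durations) if durations else None)
    return summary


SUMMARY = summarise(PROPOSALS)
```

In this toy example one of the two pdirs is never adopted, so the adoption rate for that group is 0.5 while its median duration rests on a single adopted proposal, which is a reminder of how volatile both measures become in years with tiny numbers of proposals.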
To explain and interpret the trends displayed in Figures 1–3, it is essential to recognise that several key aspects of EU policymaking changed over these 60 years. Each of the five treaty reforms – not just the SEA – modified the institutional rules, especially the scope of qualified majority voting and the role of the European Parliament, which grew significantly from consultation to cooperation to co-decision. Also, the number of member states increased from six to a maximum of 28, and each of the 13 different Commission presidents had their own particular objectives and influence. Changes to all these various factors potentially affect the volume, adoption rate and pace of EU policymaking.
As a rough first cut at bivariate analysis, the EUPROPS data can be visualised by treaty period and by Commission presidency. Figure 4 confirms the earlier finding that EU policymaking under the Treaty of Rome – applicable to the first 29 years covered by EUPROPS – was surprisingly productive. Subsequent changes to the institutional rules produced a downward trend in the volume of proposals and adoptions, even for Nice and Lisbon, given the relatively long periods of time they were in force.

Figure 4. Volume, adoption rate and pace of EU policymaking by treaty period.
The proportion of proposals adopted shows little variation across the treaty periods, apart from a sharp decline under Lisbon, which might partly be due to right-censoring. The increase in median duration under the SEA and most later treaties could suggest that the extra time involved with the cooperation procedure, and especially the co-decision procedure, more than offset any efficiency gains from widespread use of qualified majority voting. Interestingly, although the drop in median duration since Lisbon could be caused simply by right-censoring, it more likely reflects changes to how co-decision operates in practice, for instance, with time-saving trilogues (Toshkov and Rasmussen, 2012).
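The concern about right-censoring can be made concrete with a small sketch. If proposals that are still pending are simply dropped, the median is computed only over those that were adopted quickly enough to be observed, which biases it downwards. A Kaplan-Meier estimate instead keeps pending proposals in the risk set until the point at which observation stops. The Python code below is a minimal illustration under stated assumptions: the durations are invented, the function name `km_median` is hypothetical, and nothing here should be read as the estimator used to produce the figures in this article.

```python
from statistics import median


def km_median(durations, adopted):
    """Kaplan-Meier median time to adoption.

    durations: days from proposal to adoption, or to the end of
               observation for proposals still pending.
    adopted:   True if the proposal was adopted, False if censored.
    Returns the smallest duration at which the estimated share of
    proposals still pending falls to 0.5 or below, or None if the
    estimate never falls that far.
    """
    data = sorted(zip(durations, adopted))
    at_risk = len(data)
    still_pending = 1.0  # estimated probability of not yet being adopted
    i = 0
    while i < len(data):
        t = data[i][0]
        ties = sum(1 for d, _ in data if d == t)
        events = sum(1 for d, e in data if d == t and e)
        if events:
            still_pending *= 1 - events / at_risk
            if still_pending <= 0.5:
                return t
        at_risk -= ties  # adopted and censored cases both leave the risk set
        i += ties
    return None


# Invented toy durations (NOT EUPROPS data): three adopted, three still pending.
DURATIONS = [50, 80, 100, 120, 200, 300]
ADOPTED = [True, True, False, True, False, False]

NAIVE_MEDIAN = median(d for d, a in zip(DURATIONS, ADOPTED) if a)
KM_MEDIAN = km_median(DURATIONS, ADOPTED)
```

With these toy values the median over adopted proposals alone is 80 days, whereas the Kaplan-Meier median is 120 days, so ignoring pending proposals makes policymaking look faster than it is.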
Figure 5 breaks the data down by Commission presidency, from Walter Hallstein (1958 to 1967) to Ursula von der Leyen (since December 2019). Overall, it appears to confirm that who holds the Commission presidency might explain at least some of the variation in policymaking trends (Kassim et al., 2017). Interestingly, although Hallstein, Jenkins and Delors have been described as especially effective presidents (Kassim, 2012) and Malfatti, Thorn and Santer as ‘non-performers’ (Dinan, 2012) – which by some criteria might be true – the data do not entirely bear this out.

Figure 5. Volume, adoption rate and pace of EU policymaking by Commission presidency.
As shown in the top two panels, the sheer volume of policy really took off during Jean Rey's presidency, which lasted only one-third as long as Hallstein's yet produced roughly the same number of proposals and adoptions. In relative terms, it increased even further under Malfatti and Mansholt, whose stints as Commission President were approximately one-fifth and one-twelfth as long as Hallstein's. The rate of increase then levelled off under Ortoli, Jenkins and Thorn, each of whom held the position for only two-fifths as long as Hallstein did. Interestingly, relative volume remained about the same under Delors. There were more proposals and adoptions under his presidency than under any other in EU history, but Delors was in post far longer, more than two-and-a-half times the tenure of each of the three previous presidents. The relative volume of policy then declined under each subsequent Commission presidency, including Barroso's, which lasted almost exactly as long as Delors’ but produced 30% fewer proposals and adoptions.
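The notion of relative volume used in this comparison amounts to dividing the raw number of proposals (or adoptions) by the length of the presidency. A minimal Python sketch of that arithmetic follows. The counts, tenures and placeholder names are hypothetical and chosen only to mirror the case of one presidency lasting one-third as long as another while producing the same raw total; they are not EUPROPS figures.

```python
# Hypothetical counts and tenures, chosen only to illustrate the arithmetic.
# They are NOT taken from EUPROPS.
PRESIDENCIES = {
    # placeholder name: (number of proposals, tenure in years)
    "President A": (900, 9.0),
    "President B": (900, 3.0),  # one-third the tenure, same raw count
}


def proposals_per_year(count, tenure_years):
    """Raw volume divided by time in office."""
    if tenure_years <= 0:
        raise ValueError("tenure_years must be positive")
    return count / tenure_years


RELATIVE_VOLUME = {
    name: proposals_per_year(count, years)
    for name, (count, years) in PRESIDENCIES.items()
}
```

On these placeholder numbers the shorter presidency has three times the annual volume of the longer one despite an identical raw total.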
In terms of the proportion of proposals eventually adopted, there was almost no difference between Commission presidencies until the drop-off under Juncker and von der Leyen (Figure 5, bottom left panel). By contrast, the median number of days required to adopt a policy under each presidency clearly differs. As shown in the bottom right panel of Figure 5, median duration is extremely low under Hallstein, Rey and Malfatti, increases slightly starting with Mansholt, falls back slightly under Thorn, rises to a maximum under Santer and then begins to decline. Note that, as mentioned earlier, the recent decline in both the adoption rate and the median duration might be partly attributable to archiving delays or right-censoring.
Conclusion
This article identifies key shortcomings of EU legislative databases and takes steps to correct them. Information extracted from EU databases about policy proposals and their treatment by the Council has proven invaluable for the systematic study of European integration, but these databases – initially Prelex and CELEX, later merged into EUR-Lex – offer little or no coverage of the EU's first two decades and, as I discover, regardless of the time period, are plagued by errors and inconsistencies. My EUPROPS dataset contains more reliable information on EU policy proposals made from 1958 until 2021 and their outcomes.
Constructing EUPROPS involved a painstaking search through 23 years of Commission and Council archives, 65 years of EUR-Lex data and a systematic comparison of leading EU datasets. Besides addressing extensive under- or overinclusion, where EU databases wrongly omit relevant policy proposals or wrongly count non-proposals as proposals, I also correct thousands of instances where EU databases record erroneous information pertaining to the title, type, legal basis or eventual outcome of proposals. The scale of these problems has not previously been recognised, nor has the fact that Prelex and EUR-Lex often contain conflicting information about proposals.
EUPROPS provides practical advantages for individuals and groups who track EU policymaking. For example, it gives users the newfound ability to trace proposals made and policies adopted during the period from 1958 until 1975, for which EUR-Lex (and scrapes of what used to be Prelex) contains almost no information. It also provides users with more reliable information about policymaking in more recent periods. By recording all the laws adopted for each proposal, it improves the ability of users to investigate what Jupille (2004) denotes as ‘fusion’ and ‘fission’, which is when the Council combines multiple proposals into a single adopted law or splits a single proposal into multiple adopted laws. It also enables users to identify the precise subsections of treaty articles and secondary legislation that provide the legal basis for each proposal.
Preliminary analysis of EUPROPS reveals patterns that challenge conventional accounts of EU policymaking as being paralysed for decades until the ‘relaunch’ by the 1987 SEA. Overall, the 1960s, 1970s and early 1980s were not marked by an absence of Commission proposals and by painfully slow Council policymaking but instead witnessed an explosion of policy activity and some of the most rapid policymaking in EU history. By contrast, policymaking since 1987, and especially since 2009, appears to have involved fewer and fewer proposals each year, more protracted Council negotiations and a drop in the proportion of proposals ultimately adopted. Interestingly, though, trends in the volume, median duration and eventual adoption rate differ markedly across the three types of policy instruments.
EUPROPS can be used to explore these patterns and to advance theoretical research in a number of different areas. One such area is the further analysis of the Luxembourg Compromise. As noted earlier, the effects of the Luxembourg Compromise appear to have been greatly overstated, but that does not resolve the deeper theoretical question about the relevance of formal voting rules. The reason that we do not see a gear change in 1966 – faster policymaking and an even higher proportion of adoptions – might be that both before and after the Compromise the Council operated by an informal consensus norm and de facto unanimous voting on all proposals (Kleine, 2013:92–94). Alternatively, the effectiveness of formal voting rules and the increasing ‘shadow of the vote’ might provide the key explanation for how the Council was able to process the huge increase in proposals after 1965 so efficiently, with only a modest increase in median survival time. Golub (2006) finds that formal majority voting significantly expedited policymaking both before and after the Compromise, but he only examines pdirs, not pregs or pdecs, and he does not conduct a survival analysis. EUPROPS provides the essential ingredients for a more systematic large-n investigation: the proposal date, outcome and legal basis for thousands of policy proposals in the 1960s and 1970s.
A second theoretical area involves the study of EU enlargement. Scholars disagree about whether enlargements in 1981, 1986, 1995 and 2004 slowed down policymaking. But the leading studies either rely on data extracted from Prelex and EUR-Lex (König, 2007; Hertz and Leuffen, 2011; Klüver and Sagarzazou, 2013; Toshkov, 2017), which, as discussed above, is plagued by errors and also largely missing for the pre-1975 period, or only examine pdirs, not pregs or pdecs (Golub, 2007; also his data only extends back to 1968). Moreover, no study has yet focused explicitly on the impact of the 2007 or 2013 enlargements. By correcting mistakes in EU databases and extending their temporal scope, EUPROPS provides the foundation from which to identify the effects of all seven enlargements.
In addition to facilitating various large-n studies, EUPROPS also provides scholars with the material to motivate a smorgasbord of small-n and qualitative analyses about EU policymaking. For example, they could investigate what caused the exceptional spike in median duration for pregs made in 1960, or why pdirs made in 1969 experienced an unusually low adoption rate and median durations roughly three years longer than normal. Do records from national and EU archives indicate that there was something distinctively controversial about these proposals or that they encountered unusual political dynamics? At an even more granular level, analysts might process-trace individual proposals to explain why, for example, it took eight years to adopt COM(1968)0835-02 as the one-page Decision 31976D0894 establishing a standing committee on plant health or COM(1968)00471-01 as Regulation 31975R3279 on the importation of trees, bulbs and flowers from non-EU states, when both proposals were formally subject to majority voting. Did deep divisions across several groups of member states prevent the emergence of a winning coalition, or did a lone member state successfully appeal for de facto unanimity? Analysts might likewise ask why it took 21 years to adopt COM(1969)0005 and COM(1969)0006 – both subject to unanimous voting – as Directives 31990L0434 and 31990L0435 on a common system of taxation for companies of different member states. Did the proposals face widespread national opposition, or did an isolated state successfully invoke its de jure veto right? EUPROPS does not directly answer these sorts of intriguing questions, but it does provide some of the essential data needed to identify them in the first place.
Supplemental Material
The following supplemental material accompanies EUPROPS: A new dataset on policymaking in the European Union from 1958 to 2021, by Jonathan Golub, in European Union Politics:
sj-docx-1-eup-10.1177_14651165231202034
sj-dta-2-eup-10.1177_14651165231202034
sj-docx-3-eup-10.1177_14651165231202034
sj-do-4-eup-10.1177_14651165231202034
Footnotes
Acknowledgements
The author would like to thank members of the archival teams at the European Commission, European Parliament, Council of the European Union and European University Institute for their invaluable assistance in locating and accessing the historical documents that form the core of this project. Nathalie Montillot, Rob Dowling, Barney Jopson and Kenaoel Adamu provided excellent research assistance. For the enormously useful and often amusing discussions we have had over many years about the minutiae of EU policymaking, the author is grateful to Joseph Jupille. The EUPROPS dataset used in this article is open-access and available at
.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
