Abstract
The article introduces the European Court of Justice Dataset (ECJ Dataset) – a unique collection of 3776 judgments decided by the European Court of Justice since 1952. The hand-coded information about the judgments includes case outcomes, the use of central judge-made concepts in European Union law, and characteristics of the parties to disputes such as gender. This detailed legal information is unavailable from any official databases or other existing data collections. Yet, it is indispensable to examine the effect of politics and other non-legal factors on legal outcomes in the context of the ECJ’s absolute secrecy of the deliberations and the non-disclosure of judicial votes. This dataset thus (1) advances legally informed analysis of judicial behavior and legal research informed by facts; (2) helps scholars grasp the working of a politically powerful international court; (3) furthers the theorizing of the world’s most advanced, developed, and stable forms of regional economic and political integration and (4) pries open one of the most obscure judicial institutions globally.
1. Introduction: Adding European Law to the Study of European Courts
The European Court of Justice (the ECJ) was established in 1952 by the Treaty of Paris as the Court of the Coal and Steel Community. It was composed of seven judges from the six founding member states and two Advocates General. Today, it is composed of 27 judges, one from each member state, and eleven Advocates General, employing over 2200 officials, temporary agents and contract staff (Court of Justice of the European Union, 2024b). The ECJ sits permanently in Luxembourg. 1
The ECJ’s task is to ensure that the law is observed in the interpretation and the application of the founding Treaties of the European Union (Article 19 TEU 2 ). It can review the legality of the acts of European Union (EU) institutions (“actions for annulment”, Article 263 TFEU 3 ), rule in infringement actions brought by the European Commission or the member states against other member states (Articles 258–259 TFEU), rule in actions brought against the EU institutions for failing to act when required by EU law (Article 265 TFEU) and, most often and most importantly, authoritatively interpret EU law at the request of national courts and tribunals (the “preliminary reference procedure”, Article 267 TFEU). The ECJ also rules in appeals against the judgments of the General Court (Article 256 TFEU). The ECJ is the busiest international court, receiving hundreds of cases per year; in 2023, it received 821 new cases and in 2022, 806 new cases. Of these, references for preliminary rulings and appeals constituted over 90% of the docket.
The ECJ typically emerges as a politically influential and legally authoritative actor in debates about European economic and political integration (De Witte et al., 2013; Rasmussen, 1986; Stein, 1981; Weiler, 1994). Scholars have attributed this highly unusual status for an international court (Alter, 2009; Helfer & Slaughter, 1997) to a mixture of external and internal forces – the former outside and the latter within the ECJ’s control (Helfer & Slaughter, 1997). The external forces that enabled the ECJ’s rise to power include inattentiveness to the ECJ’s lawmaking by national governments over the first two decades of its existence and a warm reception of EU law by national judges, legal elites, academics and pro-European interest organizations (Alter, 2009; Weiler, 1994). Among the internal forces that solidify the ECJ’s authority is its strategy of tailoring the law and legal arguments to the circumstances of concrete cases and to the broader societal context (Burley & Mattli, 1993; Hartley, 2014; Helfer & Slaughter, 1997; Schepel & Wesseling, 1997; Stein, 1981; Vauchez, 2010; Weiler, 1991, 1994).
The magnitude and the precise effect of these forces nonetheless remain empirically elusive. On the one hand, studies show that the ECJ rules in line with the policy preferences of the member states when facing the threat of legislative override (Maduro, 1998). On the other hand, they demonstrate that even the ECJ’s most far-reaching pronouncements rarely, if ever, meet significant pushback from the state governments (Martinsen, 2015; Schmidt, 2018). Scholars continue to lock horns over the question of whether the EU legislator composed of national governments and the EU institutions can meaningfully constrain and control judicial power, and thus, the process of European integration through law (Larsson & Naurin, 2016). Consequently, we face a challenge within legal research in attempting to articulate and measure the ECJ’s legal choices and assess their societal implications. This is what motivated the data collection. 4 The underlying assumption is that this challenge can be best solved with original and legally informed data – something to which the ECJ Dataset contributes.
The ECJ Dataset includes all judgments issued in three central policy areas of the internal market where the ECJ has been playing a key role (Bradford, 2020; European Commission, 2023): the free movement of persons and European citizens (1097 judgments), the free movement of goods (1538 judgments), and the freedom of establishment and to provide services (1306 judgments). 5 This corresponds to just over one fourth of all judgments ever published by the ECJ, including one third of all judgments in preliminary reference procedures. Between the late 1970s and 2010, around 30% of the judgments of the ECJ concerned these freedoms, with the share decreasing since 2010.
In qualitative terms, the selection of internal market case law is representative of the main fault lines of European integration, the power dynamics between the ECJ and the member state governments (the political actors), as well as the interplay of law and politics in adjudication. The internal market case law is doctrinally rich and has inspired and shaped all other areas of EU law.
The subject matter of the judgments and policy areas included in the ECJ Dataset can furthermore engage with at least three central contemporary debates. First, mobility of people remains one of the most divisive issues, particularly in relation to the free movement of workers, economically inactive EU citizens, and posted workers, all included in the ECJ Dataset. Posted workers fall within the free movement of services, an issue that divides the Eastern European states from the Western European states (Kukovec, 2015). The Eastern European states argue that the legislation on posting of workers is protectionist of the national labor markets; the Western European states claim that posting is a form of social dumping (Bogoeski, 2021; Garben, 2023). The debate resonates with the conflict between the economic and the social aspects of integration (Joerges & Rödl, 2009; Scharpf, 2010), which has recently re-flourished in the literature (van den Brink et al., 2025), and which the ECJ has helped to fuel with controversial rulings. These debates and political struggles are reshaping European national welfare models (Beukers et al., 2017; Bignami, 2020; Dawson & de Witte, 2013; Ferrara & Kriesi, 2021; Vandenbroucke et al., 2017), and national politics (Hooghe & Marks, 2009).
Second, the emerging trend of state protectionism and global trade conflicts could have important implications for EU internal market governance. The ECJ Dataset tracks how the ECJ adjudicated cases involving economic freedoms, particularly the free movement of goods and the completion of the customs union. Historically, the ECJ played a central role in striking down national protectionist policies during the 1970s and 1980s while strategically avoiding head-on political confrontations (Phelan, 2019; Šadl, 2024). Similar legal battles may emerge as EU institutions and nation states seek to maintain the integrity of the internal market and national markets against the reenactment of trade barriers.
Third, the ECJ’s internal market judgments included in the ECJ Dataset have been consequential for the distribution of power between EU institutions and national governments. Key concepts, which changed this balance, such as implied powers, direct effect, proportionality, and the primacy of EU law, were minted in the internal market case law (Barnard, 2019; Tridimas, 2006). The ECJ Dataset allows scholars to examine whether the ECJ is still reinforcing EU legal integration with the same vigor and legal conceptual imagination, or shifting to a lower gear with a member state-friendlier approach. It highlights if the ECJ grants member states more policy leeway (López Zurita & Brekke, 2024; Zglinski, 2018) or, conversely, strengthens EU authority.
The 22 observed legal characteristics (“variables”; see section 2.3. Variables) are designed to depict and engage with long-term trends and contemporary themes. Eight variables are common to all policy areas and potentially to all judgments, while the remaining 14 are policy-specific, meaning that they are narrower and address legal issues of disputes typically attached to one or a few policy areas.
The variables contain information about the parties to the case (such as variables: gender or legal status), the legal questions or issues raised in the dispute, the intensity of judicial review (variable: proportionality test), and the allocation of decision-making authority between the EU and the national levels (variables: deference; national regulatory autonomy). Importantly, the variables offer a concise yet nuanced overview of case outcomes – of what the ECJ decided, including decisions about the compatibility of national measures and policies with EU law. Crucially, this information cannot be directly collected from the official data repositories like InfoCuria (Court of Justice of the European Union, 2024c) or EUR-Lex (EUR-Lex, 2024) and has so far never been hand-coded.
The ECJ Dataset thus opens new opportunities for interdisciplinary research. Rather than providing definitive answers to specific research questions, it serves as a platform for developing new hypotheses and research questions in EU law, comparative law, and judicial behavior. Its focus is on the ECJ, examining legal outcomes and arguments to explore judicial power and strategic behavior. Researchers in judicial politics can benefit from the conceptual and methodological toolbox developed in the process of data collection, such as measures of legally complex concepts. Meanwhile, legal scholars can systematically examine the significance of legal concepts for legal practice and societal challenges over time and across policy areas.
Finally, the ECJ Dataset can also support larger comparative studies of judicial behavior (Engst & Gschwend, 2024). This aligns with the growing interest in this emerging field (Epstein et al., 2024). If combined with data on judges’ reports, citation patterns, or court composition, the ECJ Dataset could furthermore provide insights into peer effects, judicial specialization, and internal ECJ dynamics (Hermansen, 2020; Krenn, 2022). For example, it may reveal how seniority or the role of the ECJ’s President who assigns cases affects legal decisions (Šadl et al., 2022). These issues are especially relevant today, as judicial independence is under pressure worldwide, inviting interest in how judges resist political pressure (Ríos-Figueroa & Staton, 2014; van Dijk, 2024; Weinshall, 2024).
The remainder of the article proceeds in two parts. The first documents the data collection’s aims and process, describing the methodology and the hand-coded characteristics of the judgments. The second illustrates how the hand-coded data can further a novel agenda, demonstrating its usefulness and broad applicability with select examples.
2. The ECJ Dataset
This section describes how the ECJ Dataset is structured, including the guiding principles of data collection, the organization of the ECJ Dataset, and the variables.
The ECJ Dataset includes information about the ECJ’s legal choices expressed in its judgments as legal tests, adopted interpretations, and outcomes, such as decisions about the compatibility of national law with EU law. Moreover, the data includes individual characteristics of the applicants. Because the ECJ never discloses its votes and deliberates behind closed doors, these tests, interpretations, and outcomes are a unique window into the mechanisms and the politics of judging.
2.1. Guiding Principles
Considering data infrastructure as a public good, an investment in shared research resources is overdue and sorely needed (Staton, 2024). This has guided the development of the present dataset.
The ECJ Dataset is a stand-alone dataset that contributes to this common effort and is fully compatible with the data available in the IUROPA CJEU Database. As a stand-alone dataset, using judgments as units of analysis, it can also be combined with meta-data available from the official repositories like EUR-Lex. The large-scale IUROPA project has previously collected information about the ECJ’s decisions and the actors involved in the proceedings, including information about cases, judges, parties, legal issues, submissions, or interventions of third parties in the proceedings and the national referring courts into the IUROPA CJEU Database. 6 This was a historic leap from the one-off data collections underpinning individual research projects. These have been often conducted by a single scholar or small research teams and tailored to specific research questions. The multi-user data platform was not driven by a single theory of judicial behavior or legal reasoning, legal priors about the importance of case law, or hypotheses.
Following Weinshall & Epstein’s (2020) criteria for common research resources and multi-user data collections, the ECJ Dataset is (1) aimed at solving real-world problems, (2) open and accessible, (3) reproducible and reliable, (4) sustainable and updatable. It is built as a foundation for present and future research needs. It also takes on board a call for interdisciplinarity, arguing that databases widely used for empirical research in courts and judicial politics – such as the US Supreme Court database (Spaeth et al., 2024) – need a more nuanced coding regime to be useful to scholars interested in law and legal doctrine (Shapiro, 2009). Such a nuanced regime would avoid the problems associated with simplification – for example, understating or overlooking important and potentially observable relationships while overstating or misidentifying results.
The 22 selected legal characteristics (see section 2.3. Variables) of the judgments address the concern of oversimplification, adding necessary information about the legal content. This list of variables was selected in a mutually informed bottom-up (reading of judgments) and top-down process (reading of EU law doctrine and literature). Satisfying criterion 1 above, the guiding features were legal relevance and political resonance to the policy area, how often a specific characteristic appeared in the judgments, and relevance for the decision-making of the ECJ (e.g., interpretation), as discussed in legal scholarship and political science literature. Such granular information and a better understanding of these concepts and tests can inspire novel research questions, hypotheses, and interdisciplinary agendas.
All information is open and accessible, satisfying criterion 2. 7 The codebook, the inter-coder reliability tests, and the transparent hand-coding process ensure that the data is reproducible and dependable (criterion 3). The hand-coding process is iterative and proceeds in several rounds, requiring repeated fine-tuning of the codebook and the illustrations that help the coders and the potential users to understand the variables, occasional recoding following inter-coder reliability tests, and extensive training of research assistants. An obstacle to the updatability of the dataset, and therefore to the fulfillment of criterion 4, is the cost of hiring experts in EU law to hand-code a growing body of judgments. Smaller-scale projects may, however, limit themselves to updating individual variables. This would dramatically reduce costs should the official maintenance of the ECJ Dataset end abruptly.
Machine learning techniques, including natural language processing (NLP) and large language models (LLM), might offer new opportunities for cost-effective large-scale data collection. Recent literature illustrates the potential of these methods in the context of EU law (Lindholm, 2024), including the extraction of legal claims and case outcomes (Ribeiro De Faria et al., 2024) and the mining of legal arguments from case law (Habernal et al., 2024). While such automated approaches could partly reduce the costs of manual hand-coding, our attempts at replicating the hand-coded variables have so far yielded unsatisfactory results (Contini et al., 2022). 8 Even where the algorithms could correctly classify selected parts of legal texts, their performance remained inferior to hand-coding by legal experts, and the resulting data were unreliable.
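As an illustration of what an inter-coder reliability test can look like, the following minimal sketch computes Cohen's kappa for two coders who coded the same judgments on one variable. The choice of Cohen's kappa is an assumption made for illustration; the article does not state which reliability statistic was used, and the codes below are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders rating the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of judgments")
    n = len(coder_a)
    # Observed agreement: share of judgments with identical codes.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement, derived from each coder's marginal distribution of codes.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[v] * freq_b[v] for v in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical value throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for one variable on ten judgments (1, 0, or -1).
coder_a = [1, 0, 1, 1, -1, 0, 1, 0, -1, 1]
coder_b = [1, 0, 1, 0, -1, 0, 1, 1, -1, 1]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

A low value on a given variable would be one signal that the codebook entry needs refinement or that the affected judgments should be recoded.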
Considering the progress in machine learning techniques, the ECJ Dataset can offer an indispensable training set for scholars interested in developing automated or semi-automated data collection approaches. Better training sets developed by legal experts can assist the development and the training of more sophisticated models on legal texts, which can deal with long text sequences (Guha et al., 2023; Nguyen et al., 2022; Yu et al., 2022).
2.2. Structure of the ECJ Dataset
The ECJ Dataset is organized in a table with individual judgments of the ECJ as rows (one per row) and the observed characteristics of the judgments (the variables) as columns. Each judgment has an identifier (ECLI number) assigned by the ECJ, as well as specific IUROPA identifiers to facilitate merging with other data tables in the IUROPA CJEU Database (The IUROPA Project, 2024).
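The one-row-per-judgment layout means that external information can be attached with a simple key-based join. The sketch below assumes hypothetical column names (`ecli`, `procedure`) and placeholder identifiers; the actual file format and the names of the identifier columns should be taken from the codebook.

```python
# Hypothetical rows: the identifiers and column names are placeholders.
ecj_rows = [
    {"ecli": "ECLI:EU:C:0000:1", "measure_rejected": 1},
    {"ecli": "ECLI:EU:C:0000:2", "measure_rejected": 0},
    {"ecli": "ECLI:EU:C:0000:3", "measure_rejected": -1},
]
metadata_rows = [
    {"ecli": "ECLI:EU:C:0000:1", "procedure": "preliminary reference"},
    {"ecli": "ECLI:EU:C:0000:2", "procedure": "infringement"},
]


def left_join_on_ecli(left, right):
    """Attach right-hand columns to each left-hand judgment; keep unmatched rows."""
    index = {}
    for row in right:
        if row["ecli"] in index:
            raise ValueError(f"duplicate ECLI in right table: {row['ecli']}")
        index[row["ecli"]] = row
    merged = []
    for row in left:
        extra = index.get(row["ecli"], {})
        merged.append({**extra, **row})  # left-hand values win on name clashes
    return merged


merged = left_join_on_ecli(ecj_rows, metadata_rows)
for row in merged:
    print(row)
```

A left join keeps every hand-coded judgment even when the external table has no matching record, so gaps in the metadata remain visible rather than silently shrinking the sample.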
Graduate students in EU law at the European University Institute in Florence, Italy, were tasked with reading the text of the judgment – including the facts, the legal framework, the preliminary questions or claims in direct actions, the reasoning of the ECJ, and the final ruling/adopted interpretation. They recorded the values for each variable in a graphical user interface, following a detailed coding protocol. 9
Table 1. Common Variables Coded in the ECJ Dataset.
2.3. Variables in the ECJ Dataset
The ECJ Dataset includes 22 variables. Eight variables are common to all judgments in the ECJ Dataset and potentially any other judgment of the ECJ across all policy areas. We label these “common variables” (Table 1). They relate to the ECJ’s decision-making and legal content. An example is the variable measure_rejected, which refers to the ECJ’s decision (case outcome) on whether the disputed national measure at issue in the case is compatible with EU law.
Policy-Specific Variables Coded Into the ECJ Dataset Relating to Judgments on the Free Movement of Persons.
Variables in the ECJ Dataset, Which are Relevant to the Free Movement of Goods and Establishment.
The TEU and the TFEU regulate various policy areas differently and the disputes in each of these areas have raised different legal problems throughout the period of observation (1952–2023). However, while every case is unique with each dispute being embedded in a specific political and economic context, the disputes within a single policy area are often comparable as they raise similar legal issues. As the freedom of establishment and the free movement of goods disputes raise very similar legal issues, the variables listed below apply to establishment and goods policy areas.
All variables allow for a high value (1) indicating that the observation of the given variable is true, and a low value (0) indicating that the observation is false. For example, value 1 in applicant_gender indicates that the applicant is female, whereas value 0 indicates that the applicant is male. A negative value (−1) indicates that the observation is not applicable; in applicant_gender, this means that the gender of the applicant could not be established, for example because of several applicants or if the applicant is a legal person (a company).
Another example is the variable proportionality_test. Value 1 indicates that the ECJ conducted a proportionality test, while value 0 indicates that the ECJ could have done so but decided not to. If the dispute or the preliminary questions do not concern a national measure or a measure adopted by the EU legislator, the ECJ does not and cannot conduct a proportionality test, and the observation is coded as negative (−1).
Finally, a value of 2 for the direct_effect variable indicates that the ECJ has already acknowledged the direct effect of a provision of EU law and is reconfirming its choice in the case at hand, whereas the regular high value (1) indicates that the ECJ recognizes the direct effect of a provision for the first time. The distinction is consequential, as direct effect bypasses the national legislator: a directly effective provision of EU law can override a conflicting provision of national law or fill a legal lacuna in domestic law. Hence, the ECJ assumes the function of the legislator and expands the scope of EU law.
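The value scheme described above has a practical consequence: averaging the raw codes would treat −1 as a number rather than as "not applicable". The following minimal sketch, using hypothetical codes, decodes the values first and computes shares over applicable observations only. Treating the value 2 of direct_effect as affirmative is an analytical choice made here for illustration.

```python
NOT_APPLICABLE = -1


def is_affirmative(variable, value):
    """True/False for applicable observations, None when coded not applicable."""
    if value == NOT_APPLICABLE:
        return None
    if variable == "direct_effect" and value == 2:
        return True  # reconfirmation of an earlier finding of direct effect
    if value in (0, 1):
        return bool(value)
    raise ValueError(f"unexpected code {value!r} for {variable}")


def share_affirmative(variable, values):
    """Share of affirmative codes among applicable observations only."""
    decoded = [is_affirmative(variable, v) for v in values]
    applicable = [d for d in decoded if d is not None]
    if not applicable:
        return None
    return sum(applicable) / len(applicable)


# Hypothetical codes, not taken from the dataset.
proportionality_codes = [1, 0, -1, 1, -1, 0, 1]
direct_effect_codes = [2, 1, 0, -1, 2]

prop_share = share_affirmative("proportionality_test", proportionality_codes)
de_share = share_affirmative("direct_effect", direct_effect_codes)
print(prop_share, de_share)
```

Researchers interested in the difference between first recognition and reconfirmation of direct effect would instead keep the values 1 and 2 apart.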
3. Potential Insights of the ECJ Dataset
This section uses examples to demonstrate how the granularity of the legal information included in the ECJ Dataset furthers interdisciplinary research.
For example, the data sheds light on processes and developments in economic integration as well as broader societal questions of relevance beyond the EU. A case in point is the litigation success rate relative to gender and equal rights protection, as well as the question of who uses EU law and for what purpose. The latter is related to the theorizing of integration, especially the neo-functionalist (Burley & Mattli, 1993) and post-functionalist accounts (Hooghe & Marks, 2009).
The data also reveals the ECJ’s ability to identify and capitalize on the law-making opportunity presented by indeterminate Treaty provisions and incomplete political agreements. Harmonization (variable 14) implies that the EU has adopted binding legislation, circumscribing the ECJ’s choice to shape the area or follow the political consensus. In contrast, the ECJ is more independent from the legislators but also more politically exposed when interpreting the Treaty, and de facto legislating in lieu of the EU legislator, who remains bound by strict procedural requirements for Treaty change. In this sense, the ECJ’s interpretation would be irreversible in a political process. The variable observes if the ECJ uses its interpretive power, testing the separation of powers/legislative override theory and adding to the debate about the ECJ’s judicial activism (Dawson et al., 2024; Schmidt, 2018).
The type of applicants varies across the policy areas included in the ECJ Dataset. While legal persons dominate in the judgments about the free movement of services and freedom of establishment (71%) and goods (85%), most free movement of persons disputes unsurprisingly involve natural persons as either claimants or defendants (86%). The preliminary reference procedure is non-adversarial and hence the distinction is rarely crucial. Rather, the decisive information is whether the ECJ decides in favor of private parties, captured by variable protection_individual (variable 13), as well as whether the ECJ finds that national rules conflict with EU law and restrict free movement, captured by measure_rejected (variable 7).
3.1. Declining Share of Judgments Finding a Conflict Between National and EU Rules
Figure 1 displays the share of free movement of goods and establishment judgments in which the ECJ held that EU law precluded the disputed national measure, meaning that the measure was contrary to the Treaty or EU legislation (variable 7). It shows that the share of such judgments has been decreasing since the year 2000, dropping to an all-time low after 2010 and to less than 0.25 in 2020.
Figure 1. The share of free establishment and free movement of goods judgments in which the ECJ concluded that the national measure was contrary to the Treaty or European legislation.
This trend can indicate, first, that the member states are increasingly complying with their free movement obligations. This interpretation would confirm the findings of Börzel (2001), Conant (2002) and the arguments of König and Mäder (2014) and Fjelstul and Carrubba (2018). Second, it might mean that the litigants are increasingly pursuing unfounded claims before the ECJ. This interpretation would prompt a different debate about the diminishing legal quality or sound legal grounding of the litigants’ claims. It might also indirectly speak to the research about the incentive structures for litigation, access to justice, or legal mobilization. These studies often focus on the frequency of litigation and structural barriers to litigation (Hermansen & Pavone, 2025, forthcoming; Passalacqua, 2021; Hofmann, 2025) but might find research into litigation success useful to test their claims with reliable data. Third, the trend could imply that the ECJ is increasingly seeking state-friendlier outcomes and searching for diplomatic and politically viable solutions (López Zurita & Brekke, 2024; Zglinski, 2018; Schroeder, 2023).
Figure 1 shows the share of judgments with a negative outcome for the national government regardless of the type of procedure. Further disaggregating the data by type of procedure (preliminary reference or infringement), using information available from the IUROPA Database and official repositories of case law like EUR-Lex, can open other kinds of inquiries. One possibility is examining whether the ECJ echoes the European Commission’s enforcement policy to “achieve more with less” (Kassim et al., 2017; Sarmiento, 2020). The European Commission adopted this policy in response to growing Euroscepticism, aiming to focus its resources on pursuing significant and consequential Treaty violations (Kelemen & Pavone, 2023). As variable 7 considers the outcome of litigation rather than the mere launching of proceedings before the ECJ, the ECJ Dataset can help identify and critically reflect on the Commission’s course of action.
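The disaggregation by type of procedure can be sketched as a grouped share. The rows below are hypothetical, and the `year` and `procedure` columns are assumed to have been merged in from external metadata; only measure_rejected is a variable name taken from the dataset. Grouping by decade is an illustrative choice.

```python
from collections import defaultdict

# Hypothetical merged rows; "year" and "procedure" are assumed metadata columns.
rows = [
    {"year": 1998, "procedure": "preliminary reference", "measure_rejected": 1},
    {"year": 1999, "procedure": "infringement", "measure_rejected": 1},
    {"year": 2004, "procedure": "preliminary reference", "measure_rejected": 0},
    {"year": 2012, "procedure": "preliminary reference", "measure_rejected": -1},
    {"year": 2015, "procedure": "preliminary reference", "measure_rejected": 0},
    {"year": 2016, "procedure": "infringement", "measure_rejected": 1},
    {"year": 2018, "procedure": "preliminary reference", "measure_rejected": 1},
]


def rejection_share_by_group(rows):
    """Share of judgments rejecting the national measure, per decade and procedure."""
    counts = defaultdict(lambda: [0, 0])  # key -> [rejected, applicable]
    for row in rows:
        code = row["measure_rejected"]
        if code == -1:
            continue  # not applicable: excluded from the denominator
        decade = row["year"] // 10 * 10
        key = (decade, row["procedure"])
        counts[key][0] += code
        counts[key][1] += 1
    return {key: rejected / total for key, (rejected, total) in sorted(counts.items())}


shares = rejection_share_by_group(rows)
for (decade, procedure), share in shares.items():
    print(decade, procedure, share)
```

With real data, small cells (few judgments per decade and procedure) would call for reporting the counts alongside the shares.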
3.2. Structural Changes in Free Movement of Persons Litigation
Figure 2 illustrates structural changes in litigation in the free movement of persons and EU citizenship domain. It shows that the share of judgments concerning workers or economically active persons moving across state borders, or who the ECJ characterized as such (variable 11), decreased from more than 70% in the early 1960s to less than 50% during the 2010s.
Figure 2. The share of free movement of persons judgments over time where the ECJ rules in favor of private individuals (the blue line) and the share of judgments where the applicant is economically active (the red line).
This trend can be better grasped in its historical and political context. The Treaty of Rome (1957) designed the free movement of workers as part of the internal market. The EU legislator adopted a series of directives and regulations to support the free movement of workers and their family members. The ECJ expanded the free movement rights in citizen-friendly rulings in the 1990s and early 2000s (Blauberger et al., 2018). EU citizens increasingly started using their Union citizenship rights to move to another member state for reasons unrelated to work, for instance to study, accompany a family member, live, retire, or look for a job. The member states reacted by restricting their rights to move and limiting their entitlement to social transfers or the right to reunite with their family members who were not EU citizens (O’Brien, 2017). These broader mobility trends and the national reactions to them changed the legal issues that the ECJ dealt with.
Figure 2 moreover illustrates that the share of judgments where the ECJ ruled for the applicant claiming free movement rights (variable 13), decreased from more than 80% (blue line) in the 1960s to less than 60% after 2015 (also the year of an unprecedented migration crisis in Europe). This points to a possible structural change in the adjudication of cases regarding free movement of persons, comprising a change in the type of applicants litigating their free movement rights, and a change in the ECJ’s willingness to uphold their claims (Davies, 2018).
Digging into the relationship between these two trends leads us to the question of whether increased political pressure from the member state governments on the ECJ and the rights of economically inactive migrants are negatively correlated (Nic Shuibhne, 2015; Sankari & Šadl, 2017; Thym, 2015), or, alternatively, whether migrants claiming free movement rights have been making excessive or unreasonable demands (Davies, 2018).
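A first descriptive step toward relating the two trends is to compare outcomes for economically active and economically inactive applicants. The sketch below uses hypothetical rows; `economically_active` is a placeholder name for variable 11, whose actual column name should be taken from the codebook, while protection_individual is the name given for variable 13. The comparison describes an association only and does not by itself measure political pressure.

```python
def favorable_share_by_status(rows):
    """Share of rulings for the individual, split by economic activity."""
    tallies = {1: [0, 0], 0: [0, 0]}  # status -> [favorable, total]
    for row in rows:
        status = row["economically_active"]
        outcome = row["protection_individual"]
        if status not in (0, 1) or outcome not in (0, 1):
            continue  # skip observations coded not applicable (-1)
        tallies[status][0] += outcome
        tallies[status][1] += 1
    return {
        status: (favorable / total if total else None)
        for status, (favorable, total) in tallies.items()
    }


# Hypothetical observations, not taken from the dataset.
rows = [
    {"economically_active": 1, "protection_individual": 1},
    {"economically_active": 1, "protection_individual": 1},
    {"economically_active": 1, "protection_individual": 0},
    {"economically_active": 1, "protection_individual": 1},
    {"economically_active": 0, "protection_individual": 1},
    {"economically_active": 0, "protection_individual": 0},
    {"economically_active": 0, "protection_individual": -1},
    {"economically_active": 0, "protection_individual": 0},
]
shares = favorable_share_by_status(rows)
gap = shares[1] - shares[0]
print(shares, gap)
```

Repeating the comparison per period would show whether any gap between the two groups widens over time, which is closer to the question raised above.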
3.3. Elucidating the ECJ’s Use of Proportionality
The ECJ Dataset supports a systematic investigation of the strategic use of legal tools and principles. For instance, scholars argue that proportionality or balancing between legally protected interests allows the ECJ to disguise policy choices as technicalities. 10
Figure 3 presents the ECJ’s use of proportionality in all judgments, showing an increase in two policy areas – free movement of persons and freedom of establishment – and a decrease in one – free movement of goods. This can potentially corroborate the claims raised in recent literature that proportionality is a form of deference to the political actors (Zglinski, 2018) or a form of abdication of judicial constitutional authority and responsibility (Aleinikoff, 1987).
Figure 3. The share of judgments with proportionality review of national measures over time. The red line shows free movement of goods, the green line free movement of persons, and the blue line freedom to provide services and freedom of establishment.
4. The Benefits and Costs of Collecting Granular Data
Explaining the choices that the judges of the ECJ have been making since 1952 is difficult without high-quality and freely accessible data. The implications of these choices have increased in parallel with the expansion of the ECJ’s jurisdiction and the EU’s competences (areas which the EU can regulate by adopting binding legislation) in policies initially reserved for nation states, such as taxation, immigration control, national security, or labor market policy. The enlargement of the Union’s geographical and political space means that the judgments rendered in disputes between the member states and the EU institutions, as well as the binding interpretations of EU law following the preliminary references of the national courts, affect thousands of policy makers and millions of citizens.
Until recently, the ECJ has resisted calls for openness (Alemanno & Stefan, 2014). Even with the modernization of the institution (like the streaming of oral hearings and a YouTube channel), its decision-making process remains opaque. It is still characterized by the absolute secrecy and non-disclosure of dossiers, the pleadings of the parties in the proceedings, 11 judicial deliberations, individual votes, the collegial nature of the judgments, the case assignment strategy to the reporting judges by the President of the ECJ, the unwritten criteria of chamber compositions and rotations, and uninformative reasoning (Conway, 2012; Jacob, 2014; Lasser, 2009). This makes the collected ECJ data presented in this article a crucial and indispensable research resource, optimal for prying open the ECJ’s choices and the impact of non-legal factors on the law – understood as the characteristics of the EU legal order, its guiding principles, the protection of individual rights, and the outer limits of EU competences.
Researchers aspiring to upgrade the legal aspect of the information about the ECJ’s judgments might consider two concrete aspects. Firstly, the process of selection, operationalization, and hand-coding of legal features is long and complex. The initial selection of legally interesting features often proves overly optimistic; many interesting aspects of a case do not necessarily lend themselves to pre-established and somewhat rigid categories designed to fit an unspecified number of cases. Many of the most interesting legal questions are rare and occur only in a handful of disputes – which hardly merits the extensive effort required to define the variables and range of possible outcomes. Categories can also prove over- or under-inclusive, owing i.a. to the limited data access allowed by the ECJ. Weinshall and Epstein (2020) thus wisely caution against data exuberance as a rule. However, to study doctrinal change, as Shapiro (2009) suggests, researchers can add their own coding protocols and supplement the foundational ECJ Dataset to achieve increasing returns for the broader research community.
Secondly, legal features that are common to all judgments are in principle easy to identify based on typical textual expressions and so lend themselves to automation – but even these can prove complicated. An automated algorithm cannot analyze the subtext: if the ECJ says that the national court should decide the matter, is it really being deferential or is it just cheap talk? Such issues are practically irresolvable; legal scholars will forever disagree about the legal importance of a case and the ECJ’s motives. The preferred option is to allow for the greatest possible transparency of the coding process and the priors of the researchers collecting the data. In Weinshall & Epstein’s (2020) framework, this translates into reproducible and reliable data, which can sustain present and future research.
5. Epilogue
The ECJ Dataset provides novel insights into the court’s shifting role in shaping the central areas of law and politics. The ECJ’s six-decade-long trajectory captured in the ECJ Dataset reflects the dynamic nature of judicial authority and EU law amid institutional and political transformations. The internal market remains central to the EU project, and the ECJ, the key political actor. More broadly, studying the ECJ’s internal market case law speaks to legal and political scholarship asking how international courts maintain political power and institutional independence. The data is a resource for two research communities. Legal researchers can situate their research in a broader empirical context. Scholars of judicial behavior can begin to integrate legal complexity into their research.
Footnotes
Acknowledgments
We would like to thank Sabine Mair, Irene Fernández Otero, Hubert Bekisz, Greta Spano and Eftychia Constantinou for their excellent research assistance.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Sapere Aude Research Leader Grant, Independent Research Fund Denmark, grant number 8046-00019A; Danish National Research Foundation, grant number DNRF169.
