Sage Journals: Discover world-class research

Abstract

Statistics on expected goals and expected points in football provide insights into teams’ expected performance based on data about shots. Some teams exceed expectations, while others fall short of this benchmark. In this preregistered study, we employ random-effects meta-analysis to decompose the deviation from expected performance into luck and skills, using data from the top seven European football leagues for men over three seasons. Our results indicate that approximately 40% of the variation in over-/underperformance during a league season is attributable to skills, while the remaining 60% is due to luck. Over a season of 38 matches, these estimates imply that the standard deviations in points attributed to skills and luck are roughly 6 points and 7 points, respectively. We demonstrate the significance of luck through simulations, indicating that, on average, it affects league rankings by 1.8 positions and showing that luck plays a decisive role in determining end-of-season outcomes.

Keywords

European football, skills, luck, expected goals (xG), expected points (xPTS), meta-analysis

Introduction

Engaging in and watching sports is a favorite pastime for many people, and the sports industry has become a large and rapidly growing global sector (Riot et al., 2018). The widespread interest, coupled with its economic relevance, has spurred an increasing academic literature focused on sports. Scholars have explored various phenomena, such as whether there is a “hot hand” effect in basketball (Gilovich et al., 1985; Miller & Sanjurjo, 2018), whether coaches affect outcomes (Berry & Fowler, 2021), whether game theory can predict behavior during football penalty kicks (Chiappori et al., 2002), whether football players and coaches exhibit reference-dependent behavior (Bartling et al., 2015), whether there is ethnic discrimination in refereeing and wage-setting (Parsons et al., 2011; Price & Wolfers, 2010; Szymanski, 2000), and whether there is nationalistic favoritism in sports judging (Emerson et al., 2009; Sandberg, 2018).

Football is the most popular sport worldwide, with the European football market generating over £35.3 billion in revenue during the 2022/23 season (Deloitte Sports Business Group, 2024). Anyone familiar with football recognizes that luck can play a decisive role in individual games (Skinner & Freeman, 2009), and the implied unpredictability of match results may contribute to the sport's popularity (Neale, 1964; Pawlowski & Anders, 2012; Rottenberg, 1956). However, determining the precise impact of skills versus luck on football outcomes remains elusive, even as recent research begins to address this question (Brechot & Flepp, 2020; Gauriot & Page, 2019; Sarkar & Kamath, 2023; Wunderlich et al., 2021). There is also a related body of literature on the importance of luck versus skills in sports and games more generally. For example, Gilbert and Wells (2019) examined the role of luck in individual games within major US professional team sports, including Major League Baseball (MLB), the National Basketball Association (NBA), the National Football League (NFL), and the National Hockey League (NHL), as well as for skills-based board games, including chess and Go. Their findings suggest that luck played the most significant role in MLB games and was least important in NFL games, while they found that luck was more influential in chess than in Go. Fioravanti et al. (2023) found that luck, ability, and effort play nearly equal roles in explaining performance variations in English Premiership Rugby. Levitt and Miles (2014) highlighted the importance of skills in online poker, although they did not quantify the role of luck. Additionally, the literature on competitiveness in sports is related to luck, as competitiveness can be modeled as the likelihood of an underdog winning (Ben-Naim et al., 2006).

Football experts typically argue that luck may play a role in individual games, but over the course of a league season with many matches, the best team consistently emerges as the winner. One way to quantify the importance of luck over a league season is to compare actual outcomes to outcomes expected based on actual match events. The availability of detailed statistics about football games has greatly enhanced the opportunities for making such comparisons. It is now common to report expected goals (xG), which describes the number of goals a team is expected to score based on the goal-scoring opportunities they create in a match (Anzer & Bauer, 2021; Rathke, 2017; Robberechts & Davis, 2020). Similarly, the expected goals of the opponent reflect the number of goals a team is expected to concede during a game. Expected goals are based on data on the position of shots as well as the location of defenders and goalkeepers; however, the exact data and modeling techniques used vary across data providers. Statistics on expected goals can also be converted into the number of points a team is expected to receive in a match and, consequently, the expected league position throughout a season. Expected goals and points can be used as performance indicators, as benchmarks for defining over- and underperformance, or as information to forecast results in future games. See Heuer and Rubner (2009, 2012), Heuer et al. (2010), Klemp et al. (2021), and Wunderlich (2024) for further discussions of performance indicators and forecasting of football matches.

While it is tempting to attribute deviations from expected goals and points in individual games or over a season to luck (Mauboussin, 2012; Sobkowicz et al., 2020), it cannot be ruled out that some teams are more skilled at converting shots into goals or systematically better at preventing goals from shots by the opponent (unless the xG model used is perfect in predicting goals). A recent study by Sarkar and Kamath (2023) compares the difference between actual and expected goals and points for the top and bottom six ranked teams in the top five football leagues in Europe. Surprisingly, their findings suggest that luck generally does not influence league positions. We revisit this question using a different methodology and including all teams irrespective of their league position.

We decompose the variation in over- and underperformance into skills and luck using a random-effects meta-analysis. The random-effects meta-analytic model is specifically designed to decompose the variability across the pooled estimates into random sampling variation and heterogeneity (Borenstein et al., 2010; Higgins et al., 2009; Riley et al., 2011), making it a suitable tool for analyzing over- and underperformance in football results. Heterogeneity measures the systematic variation in deviations from expected performance across teams throughout a season and is indicative of variability attributable to skills. As an example of between-team variability in overperformance, consider a team that consistently outperforms expected points due to strong finishing abilities or effective defensive organization. On the other hand, luck accounts for variations in overperformance that arise from factors such as referee decisions, deflected shots resulting in goals or own goals, or the variability in a particular player's finishing ability. Wunderlich et al. (2021), for instance, found that randomness played a role in nearly 50% of all goals scored in the Premier League. For further discussion and examples of how luck affects goal scoring in football, see Lames (2018). Additionally, the hot hand fallacy, i.e., the phenomenon of winning and losing streaks being attributed to skills rather than luck, highlights the challenges humans face in distinguishing between random and systematic influences on sports outcomes (Gilovich et al., 1985; Heuer & Rubner, 2009; Miller & Sanjurjo, 2018).

We used data provided by Football xG (https://footballxg.com/) encompassing the three seasons 2021/22, 2022/23, and 2023/24 for the seven European top leagues for men based on the UEFA club coefficients (bit.ly/45UvErs): Premier League (ENG), La Liga (ESP), Ligue 1 (FRA), Bundesliga (GER), Serie A (ITA), Eredivisie (NLD), and Liga Portugal (PRT). The xG data obtained from Football xG is based on the position of shots and the location of defenders and goalkeepers (this information was obtained from personal communication with the data provider).

To set the stage, Figure 1 plots the average difference between teams’ actual points and teams’ expected points per game, referred to as points overperformance (PP), for each of the seven leagues in each of the three seasons (similarly, Online Appendix Figures 1–3 illustrate the distributions of three more outcome measures entering our analyses below; see also Online Appendix Tables 1–8 for descriptive statistics overall and per league). The data in Figure 1 indicates that there is substantial dispersion in deviations from expected points across teams. In the 2023/24 season, the most overperforming team in the Premier League (ENG) was Manchester City, achieving, on average, 0.337 points more than expected per game; Sheffield United was the team that underperformed the most, finishing with 0.464 points fewer than anticipated per game. The difference of 0.802 points between the two extremes of the distribution is substantial, given that the 20 Premier League teams scored 1.392 points per match on average (sd = 0.532). On the season level, with the league season comprising 38 games, the difference between the most overperforming and the most underperforming team amounts to 30.4 points, relative to an average season-level points score of 52.9 (sd = 20.2) in the 2023/24 Premier League. The observation that Manchester City was the most overperforming team, while Sheffield United was the most underperforming team in the 2023/24 season may indicate a general trend where strong teams overperform and weak teams underperform, which we report some evidence for in Section “Simulations illustrating the impact of luck on league positions (not preregistered)”.

Figure 1.

Over- and Underperformance in Points per Season (Points Overperformance).

Despite noticeable variability in the distribution of points overperformance across seasons and leagues, it stands out that the variability in deviations from expected points is substantial. Our research aims to quantify the extent to which disparities in over- and underperformance across teams can be attributed to skills or luck. This, in turn, allows us to quantify the role of luck in determining end-of-season outcomes, such as winning the league or qualifying for the Champions League. We focus solely on luck as an explanation for teams’ performance relative to expectations, operationalized in terms of expected goals (xG) and expected points (xPTS) based on shots, while neglecting the randomness that leads to shots. Our research question is not only intriguing on its own but also significant for managers and executives in the football industry and for understanding both the potential and limitations of using xG data for forecasting future football games and setting betting odds (e.g., Dmochowski, 2023; Spann & Skiera, 2009; Wunderlich & Memmert, 2018). If there is heterogeneity in overperformance, xG statistics will be systematically biased as a forecasting tool unless they are combined with additional data that explains the variability. Heterogeneity can be seen as the upper bound for enhancing the explanatory power of xG models. Recognizing the role of randomness in overperformance is also crucial for evaluating the performance of football teams and making informed managerial decisions, such as whether to replace a coach or buy or sell players. Misinterpreting random variation as being due to skills can lead to poor decision-making, such as in the gambler's fallacy (Cowan, 1969; Sundali & Croson, 2006; Tversky & Kahneman, 1971, 1974).

Materials and Methods

Below, we outline the data and variables used and provide further details about the analyses and tests. Preregistration of a detailed analysis plan prior to accessing data mitigates the scope for questionable research practices (John et al., 2012; Nelson et al., 2018; Nosek et al., 2018; Simmons et al., 2011). Prior to obtaining the data used in our study from Football xG (https://footballxg.com/), we posted a pre-analysis plan (PAP) at the Open Science Framework (OSF) detailing the study's design and all planned analyses and tests (osf.io/9p8qr). To verify that we received the data after posting our PAP, we have added an addendum to the PAP with a letter from Football xG confirming the date they sent us the data. The only exception is that Football xG, prior to the posting of the PAP, sent us an example of the data structure with data for one Premier League team, Arsenal, for the 2023/24 season. We report all our preregistered tests and analyses and transparently report all deviations from the PAP. The only deviations are that we added some non-preregistered exploratory analyses, described in detail in a separate section below, and that there was an error in the PAP regarding the number of observations in the 2021/22 and 2022/23 seasons, which has no impact on the implementation of the analyses (details are provided in the subsection Data and Variables below).

Data and Variables

We use data provided by Football xG (https://footballxg.com/) for the three seasons 2021/22, 2022/23, and 2023/24 for seven European top leagues for men: Premier League (ENG), La Liga (ESP), Ligue 1 (FRA), Bundesliga (GER), Serie A (ITA), Eredivisie (NLD), and Liga Portugal (PRT). We use the following variables, measured for each team on the game level, to construct our outcome variables: (i) goals scored (G); (ii) expected goals scored (xG); (iii) goals against (GA; equivalent to the opponent's G); (iv) expected goals against (xGA; equivalent to the opponent's xG); (v) points (PTS); and (vi) expected points (xPTS). Based on these data, we construct the following variables (for each team and each game) used as outcome variables in our analyses (while these variables measure over- and underperformance, we refer to them as “overperformance” for simplicity):

Offensive overperformance (OP) = goals scored (G) − expected goals (xG). Offensive overperformance, sometimes referred to as goals above expectation (GAX) (e.g., Baron et al., 2024), measures whether a team scored more or fewer goals than expected in a game; positive (negative) values indicate overperformance (underperformance).

Defensive overperformance (DP) = expected goals conceded (xGA) − goals conceded (GA). Defensive overperformance measures whether a team conceded more or fewer goals than expected in a game; positive (negative) values indicate overperformance (underperformance).

Goals overperformance (GP) = offensive overperformance (OP) + defensive overperformance (DP). Goals overperformance measures whether the goal difference (G − GA) of a team was larger or smaller than the expected goal difference (xG − xGA); positive (negative) values indicate overperformance (underperformance).

Points overperformance (PP) = points (PTS) − expected points (xPTS). Points overperformance measures whether a team received more or fewer points than expected; positive (negative) values indicate overperformance (underperformance).
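The four definitions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' code (the study's analyses were run in R); the class name, field names, and the example match are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TeamGame:
    """One team's record for one match (field names are illustrative)."""
    g: int        # goals scored (G)
    xg: float     # expected goals scored (xG)
    ga: int       # goals against (GA)
    xga: float    # expected goals against (xGA)
    pts: int      # points earned (PTS)
    xpts: float   # expected points (xPTS)


def overperformance(row: TeamGame) -> dict:
    """Return the four per-game overperformance measures defined in the text."""
    op = row.g - row.xg        # offensive: more goals scored than expected
    dp = row.xga - row.ga      # defensive: fewer goals conceded than expected
    return {
        "OP": op,
        "DP": dp,
        "GP": op + dp,         # identical to (G - GA) - (xG - xGA)
        "PP": row.pts - row.xpts,
    }


# Hypothetical match: a 2-1 win from 1.4 xG for and 1.1 xG against.
example = overperformance(TeamGame(g=2, xg=1.4, ga=1, xga=1.1, pts=3, xpts=1.6))
```

Positive values indicate overperformance on every measure, which is why the defensive measure subtracts actual from expected goals conceded rather than the reverse.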

For each of the 20 teams in the Premier League (ENG), La Liga (ESP), and Serie A (ITA), there are 38 observations on the above variables for every season, as a season consists of 38 rounds; for each of the 18 teams in Bundesliga (GER), Eredivisie (NLD), and Liga Portugal (PRT), there are 34 observations on the above variables for every season. For Ligue 1 (FRA), there are 20 teams per season in the seasons 2021/22 and 2022/23, and 38 observations on the specified variables for every team, while there are 18 teams and 34 observations on the same variables for each team in the 2023/24 season. In the PAP, we mistakenly stated that there were 18 teams per season in Ligue 1 for all three seasons. As a result, we reported an incorrect number of observations for the 2021/22 and 2022/23 seasons and the overall total in the study. Our sample thus comprises 4,876 observations (from 2,438 games) in each of the 2021/22 and 2022/23 seasons and 4,728 observations (from 2,364 games) in the 2023/24 season. The total number of observations across the three seasons is 14,480 (from 7,240 games).
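The sample sizes follow directly from the league formats described above. The short Python sketch below reproduces the arithmetic; the function and constant names are ours and purely illustrative.

```python
# (teams, games per team) for the two league formats described in the text.
LARGE = (20, 38)   # 20-team league, 38 rounds
SMALL = (18, 34)   # 18-team league, 34 rounds


def league_formats(season: str) -> dict:
    """League formats per season; Ligue 1 switched to 18 teams in 2023/24."""
    formats = {"ENG": LARGE, "ESP": LARGE, "ITA": LARGE,
               "GER": SMALL, "NLD": SMALL, "PRT": SMALL}
    formats["FRA"] = SMALL if season == "2023/24" else LARGE
    return formats


def season_counts(season: str) -> dict:
    """Team-level and game-per-team observation counts for one season."""
    formats = league_formats(season)
    teams = sum(t for t, _ in formats.values())
    observations = sum(t * g for t, g in formats.values())
    # Every game contributes two observations, one per team.
    return {"teams": teams, "observations": observations,
            "games": observations // 2}


counts = {s: season_counts(s) for s in ("2021/22", "2022/23", "2023/24")}
total_observations = sum(c["observations"] for c in counts.values())
```

The `teams` entry is the number of team-level observations that enter each pooled meta-analysis, while `observations` is the game-per-team sample size.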

Preregistered Analyses and Tests

We test the preregistered hypotheses below, which are divided into primary hypothesis tests, secondary hypothesis tests, preregistered exploratory analyses, and robustness tests. In testing these hypotheses, we interpret two-sided p-values < 0.05 as “suggestive evidence” and two-sided p-values < 0.005 as “statistically significant evidence,” as suggested by Benjamin et al. (2018). The reason for using this more conservative threshold for “statistically significant findings” is to reduce the risk of false positives and to communicate results in a responsible way. As noted by Benjamin et al. (2018), a p-value of 0.05 does not represent strong evidence in favor of the tested hypothesis.¹
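The two evidence thresholds can be written as a small decision rule. The sketch below is illustrative only; the function name and the label for p-values of 0.05 or above are our own assumptions and do not appear in the preregistration.

```python
def evidence_label(p_value: float) -> str:
    """Classify a two-sided p-value using the thresholds stated in the text."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < 0.005:
        return "statistically significant evidence"
    if p_value < 0.05:
        return "suggestive evidence"
    # Label chosen for illustration; the text does not name this case.
    return "no evidence"
```

Under this rule, a p-value of 0.006 counts as suggestive but not statistically significant, which is the distinction the stricter threshold is meant to draw.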

We use the restricted maximum likelihood estimator (Viechtbauer, 2005) to estimate random-effects meta-analyses using the metafor package (v-4.6.0) (Viechtbauer, 2010) in R (v-4.3.3) (R Core Team, 2022). The estimated confidence intervals of our heterogeneity measures τ² and I² are based on the Q-profile method (Viechtbauer, 2007), implemented using the confint() function shipped with the metafor package (the heterogeneity measure τ² and its confidence interval are reported as τ and its confidence interval after taking the square root of the confidence interval bounds for τ²). The I² measure can be interpreted as the fraction of the variation in the outcome measure that is due to skills, and the remaining fraction is due to luck. The τ measure can be interpreted as the standard deviation in skills across teams; i.e., for the points overperformance outcome measure, it represents an estimate of the standard deviation in points across teams in a game that is due to differences in skills.

For the random-effects meta-analyses in the primary and secondary hypothesis tests described below, we have 132 team-level observations (including the means and standard errors of the mean of the outcome variables) per season for the 2023/24 season and 134 team-level observations per season for the 2021/22 and 2022/23 seasons. For the sub-group tests of individual leagues in the preregistered exploratory analyses below, we have 20 team-level observations for Premier League (ENG), La Liga (ESP), and Serie A (ITA) for all three seasons, 18 team-level observations for Bundesliga (GER), Eredivisie (NLD), and Liga Portugal (PRT) for all three seasons, and 20 team-level observations in the seasons 2021/22 and 2022/23 and 18 team-level observations in the 2023/24 season for Ligue 1 (FRA).

Preregistered Hypothesis Tests

Our analysis involves two primary hypothesis tests and two secondary hypothesis tests. The four tests aim to evaluate whether there is heterogeneity, specifically whether the variation across teams exceeds what would be expected due to pure randomness. Consequently, heterogeneity reflects the part of the overall variability that is attributable to skills rather than chance. The four tests share the same methodology but differ in terms of the dependent variable. Particularly, we aggregate the team-level means (and the associated standard errors) of the dependent variable (goals overperformance, points overperformance, offensive overperformance, or defensive overperformance) in a random-effects meta-analysis across leagues for each season; i.e., we estimate one random-effects meta-analysis comprising all seven leagues for each of the three included seasons. The particular hypothesis is evaluated based on the results of the corresponding Q-test. As measures of the degree of heterogeneity, we report τ and I², along with their 95% confidence intervals. We hypothesized that there would be heterogeneity across teams for all four outcome measures.
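To make the logic of the heterogeneity test concrete, the Python sketch below computes team-level means and standard errors, Cochran's Q, and moment-based estimates of τ and I². It deliberately swaps in the simpler DerSimonian-Laird estimator for the restricted maximum likelihood estimator used in the study, and it omits the Q-profile confidence intervals and the p-value of the Q-test, so its output will not match the reported estimates exactly. All function names and the example inputs are hypothetical.

```python
import math
from statistics import mean, stdev


def team_summary(values):
    """Mean and standard error of the mean of one team's per-game values."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def heterogeneity(means, ses):
    """Cochran's Q with DerSimonian-Laird tau and I^2 (not REML)."""
    k = len(means)
    w = [1.0 / se ** 2 for se in ses]            # inverse-variance weights
    pooled = sum(wi * yi for wi, yi in zip(w, means)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, means))
    df = k - 1                                   # Q is compared to chi^2 with df
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                # moment estimator, floored at 0
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"Q": q, "df": df, "tau": math.sqrt(tau2), "I2": i2}


# Four hypothetical teams with equal standard errors of 0.1.
result = heterogeneity([0.3, -0.2, 0.1, -0.25], [0.1, 0.1, 0.1, 0.1])
```

When all standard errors are equal, as in this toy input, the moment estimate of τ² reduces to the sample variance of the team means minus the sampling variance, which shows directly how heterogeneity is the spread left over after accounting for chance.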

Primary hypothesis 1: There is heterogeneity in goals overperformance across the included teams.

Primary hypothesis 2: There is heterogeneity in points overperformance across the included teams.

Secondary hypothesis 1: There is heterogeneity in offensive overperformance across the included teams.

Secondary hypothesis 2: There is heterogeneity in defensive overperformance across the included teams.

Preregistered Exploratory Analyses

We also conduct several exploratory analyses, which are described below. Particularly, we revisit our primary and secondary hypotheses separately for each league. These exploratory analyses carry little weight, and the statistical power is substantially lower. Specifically, we conduct the heterogeneity tests pertaining to primary hypotheses 1 and 2 and secondary hypotheses 1 and 2 separately for each of the seven leagues in each of the three seasons, involving a total of 21 random-effects meta-analyses for each of the four exploratory analyses.

Exploratory analysis 1: Separate tests per league of primary hypothesis 1.

Exploratory analysis 2: Separate tests per league of primary hypothesis 2.

Exploratory analysis 3: Separate tests per league of secondary hypothesis 1.

Exploratory analysis 4: Separate tests per league of secondary hypothesis 2.

Robustness Tests

As a robustness test for all the primary, secondary, and exploratory hypothesis tests listed above, we estimate ordinary least squares regressions with the hypothesis test's outcome variable entering as the dependent variable and team fixed effects entering as the independent variables. The regressions are estimated at the game-per-team level (n = 4,876 for 2021/22 and 2022/23, and n = 4,728 for 2023/24 in the robustness tests of the primary and secondary hypotheses) separately for each season, with standard errors clustered at the game level (n_c = 2,438 in 2021/22 and 2022/23, and n_c = 2,364 in 2023/24, with two observations per cluster). The robustness tests for the presence of heterogeneity in overperformance measures involve conducting Wald tests to determine whether the team fixed effects are jointly significant. We report the regressions’ R² as a measure of the fraction of the variation explained by heterogeneity, although it is not comparable in terms of magnitude to the I² estimates in the meta-analytic tests (which are based on team-level rather than game-level observations). We also report the 95% confidence interval of R² based on the approximation suggested by Olkin and Finn (1995; also see Cohen et al., 2003), implemented in R's psychometric package.
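In a regression of the outcome on a full set of team dummies, the fitted value for every observation is simply that team's mean, so R² equals one minus the ratio of the within-team to the total sum of squares. The Python sketch below illustrates only this quantity; the clustered standard errors, the Wald test, and the Olkin and Finn confidence interval are not covered, and the function name and toy data are hypothetical.

```python
from statistics import mean


def fixed_effects_r2(values_by_team: dict) -> float:
    """R^2 of an OLS regression of the outcome on team fixed effects only."""
    all_values = [v for values in values_by_team.values() for v in values]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = 0.0
    for values in values_by_team.values():
        team_mean = mean(values)                 # fitted value for this team
        ss_within += sum((v - team_mean) ** 2 for v in values)
    return 1.0 - ss_within / ss_total


# Two hypothetical teams with two games each.
r2 = fixed_effects_r2({"A": [1.0, 3.0], "B": [-1.0, -3.0]})
```

Because game-level outcomes are very noisy, an R² computed this way at the game-per-team level will typically be far smaller than the team-level I², which is the reason the two are not comparable in magnitude.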

Non-Preregistered Exploratory Analyses

We complement our preregistered analyses by reporting some results and analyses that were not preregistered to facilitate the interpretability of our findings in terms of the implications of the estimated magnitude of luck on outcomes relative to expectations. Since these analyses were not preregistered, results should be considered exploratory and interpreted with caution.

Descriptive Results on the Standard Deviation in Overperformance Due to Luck

We preregistered to report the estimated heterogeneity (τ), quantifying the between-team standard deviation in over- and underperformance attributable to skills. The remaining part of the overall variation in the overperformance measures (ν²) is due to luck (σ²), i.e., ν² = σ² + τ². To complement the reporting of τ, we report the standard deviation attributable to luck (σ) based on the random-effects meta-analysis alongside its 95% confidence interval. The 95% confidence intervals for σ are based on the derivation of confidence intervals for standard deviations (Sheskin, 2011). Note that the reporting of the standard deviation attributable to luck (σ) was not preregistered but follows directly from our preregistered estimates of I² and τ.
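The link between τ, I², and σ can be made explicit. Assuming that I² is the share of the total variance ν² = σ² + τ² that is due to heterogeneity, i.e., I² = τ² / (τ² + σ²), solving for σ gives σ = τ · √((1 − I²) / I²). The Python sketch below applies this to the rounded 2021/22 goals overperformance estimates from Table 1; the function names are ours, and the confidence interval for σ is not reproduced.

```python
import math


def luck_sd(tau: float, i2: float) -> float:
    """Solve I^2 = tau^2 / (tau^2 + sigma^2) for sigma."""
    if not 0.0 < i2 < 1.0:
        raise ValueError("i2 must lie strictly between 0 and 1")
    return tau * math.sqrt((1.0 - i2) / i2)


def skill_share(tau: float, sigma: float) -> float:
    """Fraction of the total variance attributable to skills (I^2)."""
    return tau ** 2 / (tau ** 2 + sigma ** 2)


# Rounded Table 1 inputs for goals overperformance in 2021/22.
sigma_gp = luck_sd(tau=0.143, i2=0.293)
```

With these rounded inputs the result is close to the σ of 0.222 reported in Table 1; small differences are expected because the published values are themselves rounded.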

Simulations Illustrating the Impact of Luck on League Positions

To demonstrate the effect of luck on teams’ end-of-season league ranking, we add simulation-based results. As the influence of luck will depend on the association between expected points and skills, we first estimate meta-regressions for each season, testing whether the team-level points overperformance is associated with team-level expected points per season. To illustrate the importance of luck, we conduct the following simulation separately for each league and season. Our starting point is the sum of expected points per team in each season. To account for heterogeneity in over- and underperformance, we determine the predicted performance relative to expectations based on the meta-regressions described above, multiplying the predicted value by the number of games per season. The ranking based on the adjusted expected points serves as the skill-based end-of-season league position table used as the benchmark in the simulation exercise. The adjustment based on the meta-regression amplifies the difference in expected points between teams while assuming that there is no additional heterogeneity across teams with the same number of expected points. Importantly, however, the adjustment of expected points does not affect the skills-based ranking as such. In each simulation run (k = 10,000), every team randomly draws additional points due to luck from a normal distribution with zero mean and a standard deviation equal to the estimated standard deviation attributable to luck (σ) in points overperformance for the particular season (i.e., the σ estimated as part of primary hypothesis 2 below, multiplied by the number of games per season, which translates into 6.629 (5.931) in 2021/22, 6.650 (5.950) in 2022/23, 6.733 (6.025) in 2023/24 for league seasons involving 38 (34) games, respectively). 
After adding the random draw to teams’ expected points, we compose a new fictional league ranking for each team and determine the absolute change in a team's position relative to the team's skills-based ranking (e.g., if a team was placed 9th on the skills-based ranking and 11th on the fictional ranking, this constitutes a change of 2 ranking steps). The absolute changes in ranking steps are then averaged across the 20 (or 18) teams per league season. The drawing of additional points, composing the fictional ranking, and determining absolute changes in rank positions were repeated in each of k = 10,000 simulation runs. The means across simulation runs for each of the three seasons constitute our results. To construct three additional outcome measures, we additionally record for each run (i) whether the top-ranked team changed, (ii) whether the four top-ranked teams changed, and (iii) whether the three bottom-ranked teams changed. We report the mean of these binary outcomes for each season, interpreted as the probability that luck changes (i) the winning team, (ii) the top four teams that qualify for the Champions League (or the Europa League or the Conference League for some of the spots in some of the leagues), and (iii) the bottom three teams that face relegation (or proceed to a qualifying round for not being relegated in some of the leagues).
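The simulation procedure can be sketched as follows in Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the skill-based points are passed in directly (the meta-regression adjustment is not reproduced), "changed" for the top four and bottom three is interpreted as a change in the set of teams occupying those positions, and the function names, the seed, and the example league are hypothetical.

```python
import random


def ranks(points):
    """League position of each team (1 = most points); ties keep input order."""
    order = sorted(range(len(points)), key=lambda i: -points[i])
    position = [0] * len(points)
    for place, team in enumerate(order, start=1):
        position[team] = place
    return position


def simulate_luck(skill_points, sigma_season, runs=10_000, seed=1):
    """Average effect of normally distributed luck on the final ranking."""
    rng = random.Random(seed)
    n = len(skill_points)
    base = ranks(skill_points)
    base_top4 = {i for i in range(n) if base[i] <= 4}
    base_bottom3 = {i for i in range(n) if base[i] > n - 3}
    shift = champion = top_four = bottom_three = 0.0
    for _ in range(runs):
        # Each team draws season-level luck from N(0, sigma_season^2).
        drawn = [p + rng.gauss(0.0, sigma_season) for p in skill_points]
        new = ranks(drawn)
        shift += sum(abs(a - b) for a, b in zip(base, new)) / n
        champion += new.index(1) != base.index(1)
        top_four += {i for i in range(n) if new[i] <= 4} != base_top4
        bottom_three += {i for i in range(n) if new[i] > n - 3} != base_bottom3
    return {
        "mean_rank_shift": shift / runs,
        "p_champion_changes": champion / runs,
        "p_top_four_changes": top_four / runs,
        "p_bottom_three_changes": bottom_three / runs,
    }


# Hypothetical 20-team league with skill-based points from 85 down to 28.
league = [85 - 3 * i for i in range(20)]
summary = simulate_luck(league, sigma_season=6.65, runs=2_000)
```

Setting `sigma_season` to zero returns zeros for all four outcomes, which is a quick check that every rank change in the simulation is driven by the luck draws alone.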

Results

Preregistered Primary Hypothesis Tests

Our analyses focus on two primary outcome measures, as defined above: goals overperformance (GP), which measures the difference between the actual goal difference and the expected goal difference in a game, and points overperformance (PP), which measures the difference between actual points and expected points in a game. In our two primary hypotheses, we test if there is systematic variation in the two outcome measures – heterogeneity – across teams during a season, and we estimate the extent of this variability using random-effects meta-analysis. In the meta-analytical context, heterogeneity manifests itself in the effects being more different from one another than one would expect due to sampling errors alone (Borenstein et al., 2010; Higgins et al., 2009). Translated to our application, heterogeneity implies that skills matter for over- and underperformance. To test our two primary hypotheses, we estimate random-effects meta-analytic models, pooling the mean outcome per team and season. We aggregate teams across all seven leagues but estimate separate meta-analyses for each season.

The results are illustrated in Figures 2 and 3; detailed results are provided in Table 1. We report statistically significant heterogeneity (p < 0.005) according to Cochran's Q-test in all three seasons for goals overperformance and two out of three seasons for points overperformance, with suggestive evidence (p = 0.006) for heterogeneity in points overperformance in the 2021/22 season. Hence, overall, we find relatively strong support for our two primary hypotheses. The degree of heterogeneity, as measured by I², varies from 29.3% to 38.0% for goals overperformance and from 28.7% to 55.4% for points overperformance across the three seasons (see Figure 2). The average I² across the three seasons is 33.4% for goals overperformance and 43.8% for points overperformance, indicating that approximately 40% of the variability in overperformance relative to expectations is due to skills, while around 60% is attributable to luck.

Figure 2.

Heterogeneity in the Four Overperformance Measures per Season.

Figure 3.

Luck vs. Skills in Goals Overperformance and Points Overperformance per Season.

Table 1.

Variance Component Estimates for the Four Overperformance Measures for Each Season, Pooled Across All Leagues (Preregistered Primary and Secondary Hypothesis Tests).

(a) Season 2021/22
Measure	GP	PP	OP	DP
Luck (σ)	0.222 [0.197, 0.251]	0.174 [0.155, 0.197]	0.152 [0.135, 0.172]	0.153 [0.136, 0.173]
Skills (τ)	0.143 [0.069, 0.193]	0.111 [0.040, 0.138]	0.083 [0.033, 0.130]	0.079 [0.028, 0.130]
I²	29.3% [8.7, 43.1]	28.7% [4.9, 38.4]	22.9% [4.5, 42.2]	21.0% [3.2, 41.9]
Q_df-test	Q₁₃₃ = 184.0, p = 0.002	Q₁₃₃ = 177.1, p = 0.006	Q₁₃₃ = 174.5, p = 0.009	Q₁₃₃ = 172.3, p = 0.012
(b) Season 2022/23
Measure	GP	PP	OP	DP
Luck (σ)	0.226 [0.202, 0.256]	0.175 [0.156, 0.198]	0.153 [0.136, 0.173]	0.154 [0.137, 0.174]
Skills (τ)	0.159 [0.092, 0.213]	0.166 [0.118, 0.200]	0.081 [0.027, 0.128]	0.102 [0.053, 0.138]
I²	32.9% [14.2, 47.0]	47.2% [31.1, 56.6]	22.1% [2.9, 41.2]	30.6% [10.5, 44.7]
Q_df-test	Q₁₃₃ = 196.2, p < 0.001	Q₁₃₃ = 251.6, p < 0.001	Q₁₃₃ = 171.9, p = 0.013	Q₁₃₃ = 187.8, p = 0.001
(c) Season 2023/24
Measure	GP	PP	OP	DP
Luck (σ)	0.251 [0.223, 0.284]	0.177 [0.158, 0.201]	0.167 [0.149, 0.189]	0.168 [0.149, 0.190]
Skills (τ)	0.196 [0.196, 0.131]	0.197 [0.146, 0.227]	0.127 [0.087, 0.177]	0.084 [0.015, 0.137]
I²	38.0% [21.4, 51.9]	55.4% [40.4, 62.2]	35.7% [21.4, 52.8]	20.2% [0.8, 40.0]
Q_df-test	Q₁₃₁ = 210.3, p < 0.001	Q₁₃₁ = 307.8, p < 0.001	Q₁₃₁ = 208.4, p < 0.001	Q₁₃₁ = 166.0, p = 0.021

Notes: The table shows estimates for the within-team standard deviation, σ, attributable to luck, and the between-team standard deviation, τ, attributable to skills, alongside the estimate of I² and the results of the associated Cochran's Q-test of heterogeneity for (i) goals overperformance (GP), (ii) points overperformance (PP), (iii) offensive overperformance (OP), and (iv) defensive overperformance (DP) for each of the three seasons, pooled across the seven top leagues in Europe: Premier League (ENG), La Liga (ESP), Ligue 1 (FRA), Bundesliga (GER), Serie A (ITA), Eredivisie (NLD), and Liga Portugal (PRT). 95% confidence intervals are reported in brackets. Estimates are based on random-effects meta-analysis using the restricted maximum-likelihood estimator for τ; the 95% confidence intervals of σ are determined as the confidence interval of a standard deviation (Sheskin, 2011) (reporting σ and its 95% confidence interval was not preregistered).

The estimated between-team standard deviation, τ, varies between 0.143 and 0.196 for goals overperformance, with a mean of 0.166 across the three seasons; for points overperformance, τ estimates range from 0.111 to 0.197, with a mean of 0.158. These estimates suggest that the standard deviation in performance relative to expectations among teams is 0.166 goals and 0.158 points per game. The remaining variability in overperformance can be attributed to luck. The estimated within-team standard deviation, σ, varies between 0.222 and 0.251 across the three seasons, with a mean of 0.232, for goals overperformance, while σ estimates for points overperformance range from 0.174 to 0.177, with a mean of 0.175. Overall, the estimated variability attributed to luck turned out to be very robust across the three seasons, while the dispersion in overperformance due to skills varied somewhat more. Note that we did not preregister the reporting of the standard deviation due to luck (σ); however, it follows directly from our preregistered estimates of I² and τ. The 95% confidence intervals for σ are based on the derivation of confidence intervals for standard deviations (Sheskin, 2011).
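The per-game estimates translate into season-level points by multiplying by the number of games, which is the scaling used for the simulations. The Python sketch below applies it to the rounded three-season means for points overperformance reported above; the function name is ours.

```python
def season_sd(per_game_sd: float, games: int) -> float:
    """Scale a per-game standard deviation of a team mean to a season total."""
    return per_game_sd * games


# Rounded three-season means for points overperformance.
skills_38 = season_sd(0.158, 38)   # skills (tau), 38-game season
luck_38 = season_sd(0.175, 38)     # luck (sigma), 38-game season
skills_34 = season_sd(0.158, 34)   # skills (tau), 34-game season
luck_34 = season_sd(0.175, 34)     # luck (sigma), 34-game season
```

For a 38-game season this gives roughly 6 points for skills and just under 7 points for luck, in line with the figures quoted in the abstract.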

Preregistered Secondary Hypothesis Tests

In two secondary hypotheses, we focus on the components that constitute goals overperformance: offensive and defensive overperformance. Offensive overperformance (OP) is defined as the difference between the actual number of goals scored and expected goals; defensive overperformance (DP) is defined as the difference between expected goals conceded and the actual number of goals conceded. We hypothesized that part of the variability in offensive and defensive overperformance is systematic (heterogeneity). The two hypotheses are tested in the same way as the primary hypotheses above. The results are illustrated in Figures 2 and 4; detailed results are provided in Table 1. For offensive overperformance, Cochran's Q-test indicates statistically significant evidence (p < 0.005) of heterogeneity in the 2023/24 season and suggestive evidence (p < 0.05) in the remaining two seasons. For defensive overperformance, we report statistically significant evidence of heterogeneity in the 2022/23 season and suggestive evidence in the 2021/22 and 2023/24 seasons. Thus, overall, we find support for our two secondary hypotheses, suggesting that the variability in the two primary outcome measures (i.e., goals and points overperformance) is driven by skills in realizing goal opportunities and preventing opponents’ scoring opportunities alike.
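The definitions above can be sketched in a few lines. The example assumes that goals overperformance is the actual goal difference minus the expected goal difference, so that it equals the sum of the offensive and defensive components; the class name, field names, and match values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TeamMatch:
    goals_for: int        # goals scored by the team
    goals_against: int    # goals conceded by the team
    xg_for: float         # expected goals scored
    xg_against: float     # expected goals conceded


def offensive_overperformance(m: TeamMatch) -> float:
    # Actual goals scored minus expected goals scored.
    return m.goals_for - m.xg_for


def defensive_overperformance(m: TeamMatch) -> float:
    # Expected goals conceded minus actual goals conceded.
    return m.xg_against - m.goals_against


def goals_overperformance(m: TeamMatch) -> float:
    # Assumed decomposition: GP = OP + DP.
    return offensive_overperformance(m) + defensive_overperformance(m)


match = TeamMatch(goals_for=2, goals_against=1, xg_for=1.4, xg_against=1.3)
op = offensive_overperformance(match)
dp = defensive_overperformance(match)
gp = goals_overperformance(match)
print(round(op, 2), round(dp, 2), round(gp, 2))
```

In this made-up match, the team scores 0.6 goals more than expected and concedes 0.3 fewer than expected, for a goals overperformance of 0.9.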

Figure 4.

Luck vs. Skills in Offensive Overperformance and Defensive Overperformance per Season.

The estimated I² for offensive overperformance ranges from 22.1% to 35.7% across the three seasons, with an average of 26.9%, and from 20.2% to 30.6%, with an average of 23.9%, for defensive overperformance (see Figure 2). On average, about 25% of the variation in offensive and defensive overperformance can be attributed to skills, while the remaining variability is due to luck. This is somewhat lower than for goals overperformance and points overperformance, but the confidence intervals in the I² estimates are relatively wide. For offensive overperformance, the estimated τ varies between 0.081 and 0.127 across the three seasons, with an average of 0.097; for defensive overperformance, τ ranges from 0.079 to 0.102, with an average of 0.088. These estimates suggest that the standard deviation in over-/underperformance across teams due to skills, on average, is 0.097 in terms of goals scored per game and 0.088 in terms of goals conceded per game. The remaining dispersion in teams’ performance relative to expectations can be attributed to luck. The standard deviation due to luck, σ, varies between 0.152 and 0.167, with a mean of 0.157 for offensive overperformance, and between 0.153 and 0.168, with a mean of 0.158 for defensive overperformance. As for the primary outcomes, σ estimates turned out to be robust across the three seasons, whereas τ estimates are somewhat more variable. Note that we, as stated above, did not preregister the reporting of σ; however, it follows directly from our preregistered estimates of I² and τ.
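The statement that σ follows from I² and τ can be made concrete with a short sketch. It assumes I² = τ² / (τ² + σ²), which rearranges to σ = τ · sqrt((1 − I²) / I²); the function name is hypothetical, and using season-averaged inputs only approximates the reported season-level values.

```python
import math


def sigma_from_i2(tau: float, i2: float) -> float:
    """Back out the within-team SD from tau and I^2.

    Assumes I^2 = tau^2 / (tau^2 + sigma^2); i2 is a fraction in (0, 1).
    """
    if not 0.0 < i2 < 1.0:
        raise ValueError("i2 must lie strictly between 0 and 1")
    return tau * math.sqrt((1.0 - i2) / i2)


# Season-averaged estimates quoted in the text.
op_sigma = sigma_from_i2(tau=0.097, i2=0.269)  # offensive overperformance
dp_sigma = sigma_from_i2(tau=0.088, i2=0.239)  # defensive overperformance

print(f"OP sigma: {op_sigma:.3f}")
print(f"DP sigma: {dp_sigma:.3f}")
```

Both values land close to the reported means of 0.157 and 0.158, which is what one would expect if σ is indeed implied by the other two quantities.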

Preregistered Exploratory Analyses: Subgroup Analyses

For completeness, we revisit the primary and secondary hypotheses separately for each league in exploratory analyses. The statistical power in these tests is obviously lower, and the results should be interpreted with caution. These results are reported in Online Appendix Tables 9–15. The point estimates of heterogeneity are consistent with the results reported above, but the estimates are less precise due to the lower sample sizes. For goals overperformance, I² varies between 0.0% and 57.3% across league seasons, τ varies between 0.000 and 0.309, and σ varies between 0.195 and 0.281. For points overperformance, I² varies between 0.0% and 74.0% in the different estimates, τ varies between 0.000 and 0.301, and σ varies between 0.164 and 0.197.

Preregistered Robustness Tests

As robustness tests of all the primary, secondary, and exploratory hypotheses tested above, we estimate ordinary least squares regressions at the team-game level, with the outcome variable of the respective hypothesis test as the dependent variable and team fixed effects as the independent variables. Standard errors are clustered at the game level, and the regression models are estimated separately for each season. To test for heterogeneity, we carry out a Wald test of whether the team fixed effects are jointly significant; the extent of heterogeneity is quantified in terms of the regression's R². These results, reported in Table 2, with separate results per league in Online Appendix Tables 16–22, are in line with the meta-analysis results reported above. In particular, the Wald tests indicate statistically significant evidence of heterogeneity (p < 0.005) in eight of the twelve regression analyses and suggestive evidence (p < 0.05) in the remaining four.
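To illustrate what the R² of such a regression measures, the sketch below uses the fact that an ordinary least squares regression on a full set of team dummies fits each observation with its team mean, so R² is one minus the within-team share of the total sum of squares. The function name and the toy data are hypothetical; the Wald test and the game-level clustering of standard errors are not reproduced here.

```python
from collections import defaultdict


def fixed_effects_r2(team_ids, outcomes):
    """R^2 of an OLS regression of outcomes on team fixed effects."""
    n = len(outcomes)
    grand_mean = sum(outcomes) / n

    # Group outcomes by team; the OLS fitted value is the team mean.
    by_team = defaultdict(list)
    for team, y in zip(team_ids, outcomes):
        by_team[team].append(y)
    team_mean = {team: sum(v) / len(v) for team, v in by_team.items()}

    ss_total = sum((y - grand_mean) ** 2 for y in outcomes)
    ss_resid = sum((y - team_mean[t]) ** 2 for t, y in zip(team_ids, outcomes))
    return 1.0 - ss_resid / ss_total


# Toy per-game overperformance values for two hypothetical teams.
teams = ["A", "A", "A", "B", "B", "B"]
overperf = [0.5, -0.2, 0.3, -0.4, 0.1, -0.3]
r2 = fixed_effects_r2(teams, overperf)
print(f"R^2 = {r2:.3f}")
```

With only three games per team, the toy R² is far larger than the values in Table 2; with 34 to 38 games per team and much noisier single-game outcomes, team means explain only a small share of game-level variance.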

Table 2.

R² and Wald Test Results of Joint Significance of Team Fixed Effects in Game-Level Ordinary Least Squares Regressions of the Four Overperformance Measures for Each Season, Pooled Across All Leagues (Preregistered Robustness Test).

(a) Season 2021/22
Measure	GP	PP	OP	DP
R²	0.035 [0.025, 0.045]	0.031 [0.022, 0.040]	0.035 [0.025, 0.045]	0.036 [0.026, 0.046]
Wald χ² test	χ²₁₃₃ = 174.9, p = 0.009	χ²₁₃₃ = 197.3, p < 0.001	χ²₁₃₃ = 179.3, p = 0.005	χ²₁₃₃ = 182.8, p = 0.003
(b) Season 2022/23
Measure	GP	PP	OP	DP
R²	0.037 [0.027, 0.047]	0.044 [0.033, 0.056]	0.034 [0.024, 0.044]	0.035 [0.025, 0.045]
Wald χ² test	χ²₁₃₃ = 191.8, p = 0.001	χ²₁₃₃ = 289.4, p < 0.001	χ²₁₃₃ = 176.0, p = 0.007	χ²₁₃₃ = 190.4, p = 0.001
(c) Season 2023/24
Measure	GP	PP	OP	DP
R²	0.042 [0.031, 0.053]	0.050 [0.038, 0.062]	0.042 [0.031, 0.053]	0.035 [0.025, 0.045]
Wald χ² test	χ²₁₃₁ = 199.4, p < 0.001	χ²₁₃₁ = 342.0, p < 0.001	χ²₁₃₁ = 215.0, p < 0.001	χ²₁₃₁ = 172.5, p = 0.009

Notes: The table shows the results of the Wald test, testing if team fixed effects are jointly significant and the R² (i.e., the fraction of the variance explained by team fixed effects) alongside its 95% confidence interval (reported in brackets) for goals overperformance (GP), points overperformance (PP), offensive overperformance (OP), and defensive overperformance (DP) for each of the three seasons pooled across the seven top leagues in Europe: Premier League (ENG), La Liga (ESP), Ligue 1 (FRA), Bundesliga (GER), Serie A (ITA), Eredivisie (NLD), and Liga Portugal (PRT).

The R² in the robustness tests of the primary and secondary hypotheses across seasons ranges from 3.5% to 4.2% for goals overperformance, from 3.1% to 5.0% for points overperformance, from 3.4% to 4.2% for offensive overperformance, and from 3.5% to 3.6% for defensive overperformance. Note that the R² in the robustness analyses captures the fraction of the variance per game explained by heterogeneity, whereas I² in the random-effects meta-analyses quantifies the fraction of the variance per team per season. Consequently, R² and I² estimates cannot be directly compared with one another.

Simulations Illustrating the Impact of Luck on League Positions (Not Preregistered)

We explore the impact of luck on teams’ end-of-season table rankings using simulations. This exercise and the corresponding results are exploratory and should be interpreted cautiously due to the absence of preregistration. As the impact of luck will depend on the association between expected points and skills (where skills are defined as the systematic variation in points overperformance between teams), we start by estimating random-effects meta-regressions testing whether teams’ expected points moderate over- and underperformance. Specifically, we regress the mean points overperformance on the mean team-level expected points separately for each of the three seasons; the results are reported in Table 3. We find a statistically significant association between expected points and points overperformance in all three seasons, with an average coefficient of 0.393, implying that a one-point increase in expected points per game is associated with an increase in overperformance of 0.393 points per game (translating into 14.9 points per season with 38 games). This suggests that teams that are expected to receive more points are more likely to overperform, indicating that teams that are more skilled overall, as measured by expected points, are also more adept at converting shots into goals and preventing opponents from converting their shots. As these tests were not preregistered, the results need to be confirmed in future confirmatory analyses to carry more weight.

Table 3.

Meta-Regressions of Points Overperformance on Expected Points per Season.

Season	2021/22	2022/23	2023/24
Intercept	−0.438** (0.059) [−0.554, −0.323]	−0.500** (0.076) [−0.648, −0.352]	−0.765** (0.072) [−0.905, −0.625]
Exp. points (xPTS)	0.300** (0.041) [0.220, 0.384]	0.357** (0.053) [0.220, 0.384]	0.522** (0.049) [0.426, 0.619]
τ	0.000 [0.000, 0.079]	0.110 [0.053, 0.152]	0.088 [0.000, 0.123]
I²	0.000 [0.000, 0.169]	0.284 [0.083, 0.430]	0.196 [0.000, 0.323]
Cochran's Q-test	Q₁₃₂ = 123.7, p = 0.685	Q₁₃₂ = 181.5, p = 0.003	Q₁₃₂ = 152.6, p = 0.085

Notes: The table shows the results of team-level random-effects meta-regressions (using the restricted maximum likelihood estimator for τ²) of points overperformance (PP) on teams' expected points (xPTS) for the three seasons 2021/22, 2022/23, and 2023/24, alongside estimates of the residual heterogeneity (τ, I², and Cochran's Q-test), pooled across the seven top leagues in Europe: Premier League (ENG), La Liga (ESP), Ligue 1 (FRA), Bundesliga (GER), Serie A (ITA), Eredivisie (NLD), and Liga Portugal (PRT). Standard errors are reported in parentheses; 95% confidence intervals are reported in brackets.

* p < 0.05, ** p < 0.005.
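The average slope and its per-season translation quoted in the text can be reproduced from the point estimates in Table 3. The sketch below is plain arithmetic on those estimates; the helper function and the example value of 2.0 expected points per game are hypothetical.

```python
# Point estimates from Table 3 (intercept, slope on expected points).
estimates = {
    "2021/22": (-0.438, 0.300),
    "2022/23": (-0.500, 0.357),
    "2023/24": (-0.765, 0.522),
}
GAMES_PER_SEASON = 38

# Unweighted mean of the three seasonal slopes.
mean_slope = sum(slope for _, slope in estimates.values()) / len(estimates)

# Per-game effect scaled to a 38-game season.
season_points = mean_slope * GAMES_PER_SEASON


def predicted_overperformance(season: str, xpts_per_game: float) -> float:
    """Fitted points overperformance per game at a given xPTS level."""
    intercept, slope = estimates[season]
    return intercept + slope * xpts_per_game


print(f"mean slope: {mean_slope:.3f}")
print(f"points per 38-game season: {season_points:.1f}")
print(f"2023/24 fit at 2.0 xPTS: {predicted_overperformance('2023/24', 2.0):.3f}")
```

The fitted line for 2023/24 implies that a team expected to collect 2.0 points per game would overperform by about 0.28 points per game, while teams with low expected points would fall short of expectations.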

To illustrate how variability attributable to luck translates into variability in end-of-season league rankings, we engage in a simple simulation exercise, which we carry out separately for each league and season (as described in more detail in the Materials and Methods section above). The results are illustrated in Figure 5. Across leagues and seasons, the average change in league positions due to luck varies between 1.389 and 2.359, with an average of 1.781, and the likelihood that the winning team is altered due to chance varies between 0.4% and 74.4% (m = 33.9%). The impact of luck changes the list of top teams qualifying for the Champions League with a probability ranging from 1.6% to 92.0% (m = 56.0%), while the set of teams facing relegation is altered with a probability varying from 35.8% to 96.5% (m = 75.9%).
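The logic of such an exercise can be sketched as follows. This is not the procedure described in the Materials and Methods section; it is a minimal illustration that assumes luck enters as independent, normally distributed noise with standard deviation σ added to each team's luck-free points per game. The league, the function names, and the default parameters are all hypothetical.

```python
import random


def ranks(points):
    """Return league positions (1 = best); ties are broken by list order."""
    order = sorted(range(len(points)), key=lambda i: -points[i])
    position = [0] * len(points)
    for pos, team_index in enumerate(order, start=1):
        position[team_index] = pos
    return position


def mean_rank_shift(baseline_ppg, sigma, n_sims=2000, seed=1):
    """Average absolute change in league position caused by luck."""
    rng = random.Random(seed)
    base = ranks(baseline_ppg)
    total = 0.0
    for _ in range(n_sims):
        # Add a luck draw to every team's luck-free points per game.
        noisy = [p + rng.gauss(0.0, sigma) for p in baseline_ppg]
        simulated = ranks(noisy)
        total += sum(abs(a - b) for a, b in zip(base, simulated)) / len(base)
    return total / n_sims


# Hypothetical 20-team league with evenly spaced luck-free points per game.
league = [0.8 + 1.5 * i / 19 for i in range(20)]
shift = mean_rank_shift(league, sigma=0.175)
print(f"average positions moved by luck: {shift:.2f}")
```

The size of the shift depends heavily on how tightly teams are bunched: the closer the luck-free points totals, the more often a given amount of luck reorders the table.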

Figure 5.

Simulation Results (Not Preregistered).

Discussion

We tested two primary hypotheses about the presence of heterogeneity in goals overperformance and points overperformance across seven football leagues in Europe for three seasons, hypothesizing that part of the variability in deviations from the expected performance is attributable to skills. We reported statistically significant evidence of heterogeneity in five of the tests, while suggestive evidence was noted in the sixth test. This suggests that there are moderators of teams’ overperformance, which are not accounted for in performance predictions. In addition, two secondary hypothesis tests provided support for the existence of systematic variability above and beyond chance in offensive and defensive overperformance. Therefore, we found strong support for heterogeneity in over- and underperformance, which is indicative of skills explaining part of the variability in teams’ performance relative to expectations in football.

Regarding the magnitude of heterogeneity, our point estimates indicate that skills account for approximately 40% of the variation in over- and underperformance, while luck accounts for roughly 60% of the deviations from expected performance. Averaged across the three seasons and scaled to a season of 38 matches, the estimated standard deviation in points over- and underperformance due to skills (τ) is approximately 6 points per season, while the standard deviation due to luck (σ) is about 7 points per season. These estimates suggest that the ability to perform better than expected can substantially affect a team's final league position, especially for teams that are closely ranked in the table. Our analysis also suggests that luck plays a crucial role in determining league positions. Simulation results show that, on average, luck shifts a team's rank by 1.781 positions and has a 56.0% probability of altering the top four positions, which secure qualification for the Champions League.
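The per-season figures follow from the per-game estimates for points overperformance by simple scaling. The sketch below assumes that both τ and σ are expressed on the scale of a team's mean per game, so that the season total scales with the number of games; the function name is hypothetical.

```python
def per_season(sd_per_game: float, games: int = 38) -> float:
    """Scale a per-game standard deviation of the team mean to a season total."""
    return sd_per_game * games


# Season-averaged per-game estimates for points overperformance.
tau_points = per_season(0.158)    # skills
sigma_points = per_season(0.175)  # luck

print(f"skills: {tau_points:.2f} points per season")
print(f"luck:   {sigma_points:.2f} points per season")
```

This gives roughly 6.0 and 6.7 points, in line with the rounded values of 6 and 7 points. The `games` argument can be changed for leagues that play fewer matches per season.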

There are some important caveats regarding our results. We focused solely on luck as an explanation for teams’ performance relative to expectations, operationalized in terms of expected goals (xG) and expected points (xPTS). However, luck likely also plays a role in determining the number of shots in a game, which is not incorporated in our results. The variability in our outcome measures will also depend on the predictive ability of the expected goals data used. Our results indicate significant heterogeneity, suggesting that the predictive accuracy of the xG data used is decent but not optimal as a performance indicator. As these predictions become more sophisticated, the variation in outcome measures between teams may decrease. The expected goals data used in this study are based on the position of the shots as well as the location of defenders and goalkeepers. If additional variables associated with the efficiency of converting shots into goals were modeled, the predictions might capture part of the variation in over- and underperformance that we attributed to skills. This expected decrease in heterogeneity attributable to skills suggests that our estimates of the importance of skills in these outcome measures set an upper bound on the additional variation that more sophisticated prediction models can explain. One additional variable that could enhance the predictive accuracy of expected goals data is the identity of the player with the goal-scoring opportunity, which would allow for differences in player-specific skills in converting shots into goals. The lack of publicly available game-level data on expected goals from different providers complicates testing for differences between alternative estimates of expected goals and points.

However, for the 2023/24 season, we analyzed the season-level correlation between our expected goals and expected points data and the corresponding data from Understat (https://understat.com/), the data provider referenced in the study by Sarkar and Kamath (2023), across the top five leagues. We manually sourced the data from the league tables featuring expected goals and points on the Understat website; data for Eredivisie (NLD) and Liga Portugal (PRT) are not available there. Across the five leagues, the Pearson correlations ranged from 0.966 to 0.983 for expected goals and from 0.962 to 0.991 for expected points. It is important to note that the luck component of the variation in overperformance, defined as the fraction of the variability between teams not explained by heterogeneity, may depend on omitted variables, model limitations, or systematic dispersion that xG models do not account for, which could result in an overestimation of the luck component.
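A season-level comparison of this kind amounts to computing a Pearson correlation across the teams of a league. The sketch below shows the calculation on made-up numbers; the provider labels and all values are hypothetical and do not come from the study or from Understat.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical season totals of expected goals for five teams,
# as reported by two different data providers.
provider_a = [68.2, 55.1, 47.9, 40.3, 33.6]
provider_b = [66.9, 56.4, 46.2, 41.8, 32.0]

r = pearson_r(provider_a, provider_b)
print(f"Pearson r = {r:.3f}")
```

A high correlation at the season level indicates that the providers rank and space teams similarly, although it does not rule out larger disagreements at the level of single games or shots.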

Applying our methodology to more leagues and seasons in future research could enhance the precision of estimates of the variability attributable to luck and skills. Additionally, it opens up opportunities to investigate whether the skills component of performance relative to expectations is related to specific team characteristics or individual players. Our methodology can also be adapted to analyze individual players instead of teams to determine the extent to which deviations from expected goals for players are explained by skills or luck.

Supplemental Material

sj-pdf-1-jse-10.1177_15270025251374620 - Supplemental material for Skills vs. Luck: Decomposing Deviations from Expected Performance in European Football Leagues

Supplemental material, sj-pdf-1-jse-10.1177_15270025251374620 for Skills vs. Luck: Decomposing Deviations from Expected Performance in European Football Leagues by Felix Holzmeister and Magnus Johannesson in Journal of Sports Economics

Footnotes

ORCID iDs

Felix Holzmeister

Magnus Johannesson

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Supplemental Material

Supplemental material for this article is available online.

Author Biographies

Felix Holzmeister is an Assistant Professor of Behavioral and Experimental Economics and Finance. His research centers around metascience, research methodology, and behavioral economics.

Magnus Johannesson is a Professor of Economics. His area of research is metascience, experimental economics, and genoeconomics.

References

Anzer & Bauer (2021). A goal scoring probability model for shots based on synchronized positional and event data in football (soccer). Frontiers in Sports and Active Living, 3. https://doi.org/10/g9pwvp

Baron, Sandholtz, Pleuler, & Chan, T. C. Y. (2024). Miss it like Messi: Extracting value from off-target shots in soccer. Journal of Quantitative Analysis in Sports, 20(1), 37–50. https://doi.org/10/g9r93p

Bartling, Brandes, & Schunk (2015). Expectations as reference points: Field evidence from professional soccer. Management Science, 61(11), 2646–2661. https://doi.org/10/gf6776

Ben-Naim, Vazquez, & Redner (2006). Parity and predictability of competitions. Journal of Quantitative Analysis in Sports, 2(4). https://doi.org/10/drd2fz

Benjamin, D. J., Berger, J. O., Johannesson, Nosek, B. A., Wagenmakers, E.-J., Berk, Bollen, K. A., Brembs, Brown, Camerer, Cesarini, Chambers, C. D., Clyde, Cook, T. D., De Boeck, Dienes, Dreber, Easwaran, K., Efferson, … Johnson, V. E. (2018). Redefine statistical significance. Nature Human Behaviour, 2(1), 6–10. https://doi.org/10/cff2

Berry, C. R., & Fowler (2021). Leadership or luck? Randomization inference for leader effects in politics, business, and sports. Science Advances, 7(4), eabe3404. https://doi.org/10/gq42xh

Borenstein, Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2010). A basic introduction to fixed-effect and random-effects models for meta-analysis. Research Synthesis Methods, 1(2), 97–111. https://doi.org/10/dz75pm

Brechot & Flepp (2020). Dealing with randomness in match outcomes: How to rethink performance evaluation in European club football using expected goals. Journal of Sports Economics, 21(4), 335–362. https://doi.org/10/gjsc8z

Chiappori, P.-A., Levitt, & Groseclose (2002). Testing mixed-strategy equilibria when players are heterogeneous: The case of penalty kicks in soccer. American Economic Review, 92(4), 1138–1151. https://doi.org/10/fqt7mc

Cohen, West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates Publishers.

Cowan, J. L. (1969). The gambler’s fallacy. Philosophy and Phenomenological Research, 30(2), 238–251. https://doi.org/10/cthmbf

Deloitte Sports Business Group. (2024). Football division: Annual review of football finance 2024. Deloitte. https://bit.ly/4f0UAyv

Dmochowski, J. P. (2023). A statistical theory of optimal decision-making in sports betting. PLoS One, 18(6), e0287601. https://doi.org/10/gsd32b

Emerson, J. W., Seltzer, & Lin (2009). Assessing judging bias: An example from the 2000 Olympic games. The American Statistician, 63(2), 124–131. https://doi.org/10/dszjh8

Fioravanti, Delbianco, & Tohmé (2023). The relative importance of ability, luck and motivation in team sports: A Bayesian model of performance in the English Rugby Premiership. Statistical Methods & Applications, 32(3), 715–731. https://doi.org/10/g9pwb2

Gauriot & Page (2019). Fooled by performance randomness: Overrewarding luck. The Review of Economics and Statistics, 101(4), 658–666. https://doi.org/10/gn896x

Gilbert, D. E., & Wells, M. T. (2019). Ludometrics: Luck, and how to measure it. Journal of Quantitative Analysis in Sports, 15(3), 225–237. https://doi.org/10/g9pwbw

Gilovich, Vallone, & Tversky (1985). The hot hand in basketball: On the misperception of random sequences. Cognitive Psychology, 17(3), 295–314. https://doi.org/10/bfdfhd

Heuer, Müller, & Rubner (2010). Soccer: Is scoring goals a predictable Poissonian process? Europhysics Letters, 89(3), 38007. https://doi.org/10/cbp8nm

Heuer & Rubner (2009). Fitness, chance, and myths: An objective view on soccer results. The European Physical Journal B, 67(3), 445–458. https://doi.org/10/bxsgh9

Heuer & Rubner (2012). How does the past of a soccer match influence its future? Concepts and statistical analysis. PLoS One, 7(11), e47678. https://doi.org/10/g9pwvq

Higgins, J. P. T., Thompson, S. G., & Spiegelhalter, D. J. (2009). A re-evaluation of random-effects meta-analysis. Journal of the Royal Statistical Society Series A: Statistics in Society, 172(1), 137–159. https://doi.org/10/dmthxv

John, L. K., Loewenstein, & Prelec (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. https://doi.org/10/f33h6z

Klemp, Wunderlich, & Memmert (2021). In-play forecasting in football using event and positional data. Scientific Reports, 11(1), 24139. https://doi.org/10/g9pwvs

Lames (2018). Chance involvement in goal scoring in football – an empirical approach. German Journal of Exercise and Sport Research, 48(2), 278–286. https://doi.org/10/g9pwvv

Levitt, S. D., & Miles, T. J. (2014). The role of skill versus luck in poker: Evidence from the world series of poker. Journal of Sports Economics, 15(1), 31–44. https://doi.org/10/f5np9v

Mauboussin, M. J. (2012). The success equation: Untangling skill and luck in business, sports, and investing. Harvard Business Publishing.

Miller, J. B., & Sanjurjo (2018). Surprised by the hot hand fallacy? A truth in the law of small numbers. Econometrica, 86(6), 2019–2047. https://doi.org/10/ghdczh

Neale, W. C. (1964). The peculiar economics of professional sports. The Quarterly Journal of Economics, 78(1), 1–14. https://doi.org/10/dcz6p7

Nelson, L. D., Simmons, & Simonsohn (2018). Psychology’s renaissance. Annual Review of Psychology, 69, 511–534. https://doi.org/10/gfgt65

Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences, 115(11), 2600–2606. https://doi.org/10/gc6xk8

Olkin & Finn, J. D. (1995). Correlations redux. Psychological Bulletin, 118(1), 155–164. https://doi.org/10/bt48vd

Parsons, C. A., Sulaeman, Yates, M. C., & Hamermesh, D. S. (2011). Strike three: Discrimination, incentives, and evaluation. American Economic Review, 101(4), 1410–1435. https://doi.org/10/fw6znt

Pawlowski & Anders (2012). Stadium attendance in German professional football – the (un)importance of uncertainty of outcome reconsidered. Applied Economics Letters, 19(16), 1553–1556. https://doi.org/10/g9pwbq

Price & Wolfers (2010). Racial discrimination among NBA referees. The Quarterly Journal of Economics, 125(4), 1859–1887. https://doi.org/10/dk4jqh

R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/

Rathke (2017). An examination of expected goals and shot efficiency in soccer. Journal of Human Sport and Exercise, 12(Proc2). https://doi.org/10/g9pwvm

Riley, R. D., Higgins, J. P. T., & Deeks, J. J. (2011). Interpretation of random effects meta-analyses. BMJ, 342, d549. https://doi.org/10/bm3s6c

Riot, Kennelly, Hill, & Trenberth (2018). The sport business industry in the twenty-first century. In Hassan (Ed.), Managing Sport Business (2nd ed.). Routledge. https://doi.org/10/nm2s

Robberechts & Davis (2020). How data availability affects the ability to learn good xG models. In Brefeld, Davis, Van Haaren, & Zimmermann (Eds.), Machine learning and data mining for sports analytics (pp. 17–27). Springer. https://doi.org/10/g9pwvn

Rottenberg (1956). The baseball players’ labor market. Journal of Political Economy, 64(3), 242–258. https://doi.org/10/bnnjbn

Sandberg (2018). Competing identities: A field study of in-group bias among professional evaluators. The Economic Journal, 128(613), 2131–2159. https://doi.org/10/gd8wgt

Sarkar & Kamath (2023). Does luck play a role in the determination of the rank positions in football leagues? A study of Europe’s ‘big five’. Annals of Operations Research, 325(1), 245–260. https://doi.org/10/nmt6

Sheskin, D. J. (2011). Handbook of parametric and nonparametric statistical procedures (5th ed.). Chapman and Hall, CRC Press.

Simmons, J. P., Nelson, L. D., & Simonsohn (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10/bxbw3c

Skinner, G. K., & Freeman, G. H. (2009). Soccer matches as experiments: How often does the ‘best’ team win? Journal of Applied Statistics. https://doi.org/10/d45q8q

Sobkowicz, Frank, R. H., Biondo, A. E., Pluchino, & Rapisarda (2020). Inequalities, chance and success in sport competitions: Simulations vs empirical data. Physica A: Statistical Mechanics and its Applications, 557, 124899. https://doi.org/10/nmt7

Spann & Skiera (2009). Sports forecasting: A comparison of the forecast accuracy of prediction markets, betting odds and tipsters. Journal of Forecasting, 28(1), 55–72. https://doi.org/10/c9vqk2

Sundali & Croson (2006). Biases in casino betting: The hot hand and the gambler’s fallacy. Judgment and Decision Making, 1(1), 1–12. https://doi.org/10/g9pzf9

Szymanski (2000). A market test for discrimination in the English professional soccer leagues. Journal of Political Economy, 108(3), 590–603. https://doi.org/10/dk9vfg

Tversky & Kahneman (1971). Belief in the law of small numbers. Psychological Bulletin, 76(2), 105–110. https://doi.org/10/cqwr5s

Tversky & Kahneman (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131. https://doi.org/10/gwh

Viechtbauer (2005). Bias and efficiency of meta-analytic variance estimators in the random-effects model. Journal of Educational and Behavioral Statistics, 30(3), 261–293. https://doi.org/10/fhd39t

Viechtbauer (2007). Confidence intervals for the amount of heterogeneity in meta-analysis. Statistics in Medicine, 26(1), 37–52. https://doi.org/10/cp37bj

Viechtbauer (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36, 1–48. https://doi.org/10/gckfpj

Wunderlich (2024). Using the wisdom of crowds in sports: How performance analysis in football can benefit from the information enclosed in betting odds. International Journal of Performance Analysis in Sport, 1–20. https://doi.org/10/g9pwvt

Wunderlich & Memmert (2018). The betting odds rating system: Using soccer forecasts to forecast soccer. PLoS One, 13(6), e0198668. https://doi.org/10/gdnq3q

Wunderlich, Seck, & Memmert (2021). The influence of randomness on goals in football decreases over time. An empirical analysis of randomness involved in goal scoring in the English Premier League. Journal of Sports Sciences, 39(20), 2322–2337. https://doi.org/10/gj592n
