Abstract
Most studies exploring the global market of association football (soccer) transfers have relied on the hedonic pricing approach, where the characteristics of the player are the key features. While this has largely been useful, it raises two challenges: the samples used do not represent the global market, and the estimates are susceptible to selection bias. In this exploratory study, we aimed to address these two key issues by proposing a different modelling approach that relies on an original solution (i.e., video gaming data). First, we built a random global sample by starting from an existing player universe and appending video gaming, transfer and salary data. Second, we developed new measures of transfer prices (without selectivity). This design is superior in that it addresses the challenges of non-representativeness and selection bias. Indeed, the results are homoscedastic across different segments (regions and player positions) and are more robust and more representative of the global market than those of previous studies.
Introduction
Every summer and winter, a multitude of association football (soccer) fans, analysts, media, and researchers eagerly await the rumours and indicators of possible new signings that could surpass the preceding transfer window announcements. The Mercato has become an amusing festival for the stakeholders because it engages big names and significant amounts of money, boosting trade wars for talent acquisition among clubs. Such wars have fuelled speculation and affected the valuation of transfer fees and salaries of top football players among top football clubs.
Media coverage, including social media, has also played a crucial role in reshaping the financial model of football, which has evolved over the years, influencing top management's decisions and fanbases’ reactions to certain transfers. In this context, the French giant Paris Saint-Germain spent €220 m to secure the services of FC Barcelona star Neymar da Silva Santos Júnior in the summer of 2017, the highest amount ever paid for a football player. This record sale, forced by the triggering of the player's release clause, led to a panic-buying reaction by the Spanish giant FC Barcelona, which went on a spending spree of €145 m and €144 m for the purchases of Ousmane Dembele and Philippe Coutinho.
Figure 1 reveals the growing expenditure in transfer fees by the top five European men's football leagues (England, France, Germany, Italy and Spain), the so-called ‘Big Five’, until 2019/2020, i.e., before COVID-19. The buying club usually pays an estimate of the transferred player's market worth. Thus, the issue is to identify the key elements affecting football player pricing. Football clubs are struggling financially for various reasons, including negative demand and productivity shocks, relegations and, more recently, the COVID-19 pandemic, with insolvency as a potential consequence (Scelles et al., 2018; Szymanski, 2017; Szymanski & Weimar, 2019).

Growth of expenditure in transfer fees by the top five European men's football leagues. This FIFA data is used for a broader context on transfer fee growth. Source: FIFA (2019).
Understanding the player pricing function's determinants is crucial to the financial management of football clubs. In this respect, video gaming data can be very helpful. First, several advancements in sports technology have made video game data more reflective of real-life player performance, thus providing a novel avenue for estimating player values more accurately than traditional methods, which often rely solely on historical data or subjective assessments. Second, with the recent trends where clubs face significant financial pressures to make informed decisions in player acquisitions, determining the extent to which transfer values can be predicted by combining video game characteristics with real-life statistics is particularly useful. Indeed, many models using the hedonic pricing of football transfers were influenced by several such studies, see, for instance, Payyappalli and Zhuang (2019). Third, transfer pricing, while most associated with football, plays a significant role in various sports, and a broader view of its predictors can thus be useful in further developing the broader economic landscape for professional athletes. Indeed, Franceschi et al. (2024) found that transfer fees were affected by the football player's age, height, goals, assists, and appearances, the timing of the transfer, and the selling and buying clubs’ size and prestige. They also found that these parameters vary significantly among market segments (i.e., significant heterogeneity occurs) by geography (regions or nations), sports level (leagues), or in-game positions.
However, these studies and their approaches present two main issues. First, they relied on small samples and could not be deemed representative of the global market. Second, they were susceptible to selection bias because they valued the transfer prices from a set of transferring players. These econometric issues highlight the need for research based on larger samples representative of the global market and addressing the selectivity problem.
Most studies used independent variables like the players’ characteristics, their competitive record and fame, and some characteristics of the contract between them and the club. One may argue that these “real” performance variables are available not only for players who are transferred but also for those who are not transferred. However, their availability is not straightforward when it comes to leagues of lower standard. Therefore, using them may not allow researchers to build a sample representative of the global market. To address this issue, this paper relied on video gaming data which represents players’ skills given by experts. The reason for using such type of data is that every player, transferred or not, is evaluated by experts globally and therefore we were able to include a more comprehensive subset of players in this study. The rationale behind this strategy was to contribute a solution to the lack of representativeness and selection bias problem, which has not been addressed properly in previous studies. The results show that valuation models can be consistent across time or space, while the difficulty of tackling selection bias and heteroscedasticity in a global model using the transfer fee alone was solved by aggregating various elements of players’ cost in one overall package. Such aggregation generated promising results and findings. The rest of the paper is organized as follows. Section 2 reviews the literature. Section 3 presents the data, some econometric issues and the estimation strategy. This is followed by an analysis of the results (Section 5) before Section 6 provides some concluding remarks.
Literature review
The literature on football player transfer pricing is both historically deep and methodologically diverse, reflecting the growing complexity and commercial importance of player mobility in the sport. Early work, most notably by Carmichael and Thomas (1993), laid a theoretical foundation for the subject by leveraging Nash bargaining theory to analyze how bargaining power, player attributes (such as age, appearances, and goal records), and positional factors affect transfer fees. Their research underscored that transfer pricing is fundamentally shaped by the negotiation context as well as by quantifiable features of both players and clubs.
Subsequently, Dobson and Gerrard (1999) advanced the field by introducing more nuanced market segmentation to account for geography and playing positions, enriching analytical frameworks with variables like international caps and differentiating between buyer and seller club motives. This segmentation made it clear that transfer fees arise from a combination of individual talent, strategic club behavior, and prevailing market conditions.
A key trend in the modern literature has been the incorporation of new factors reflecting the sport's commercialization. Recent studies consider media presence, player popularity, and community-driven valuations, exemplified by the use of crowdsourced data from Transfermarkt or Google search counts, as influential determinants alongside traditional performance metrics. Herm et al. (2014) is notable for analyzing the market's “branding” dimension, finding that both media visibility and online community valuations can meaningfully predict transfer prices. Barbuscak (2018) used multivariate regression models to demonstrate that variables such as contract duration, popularity, and market value, as determined by highly engaged fans, frequently hold as much relevance as purely athletic factors. These insights introduce a layer of subjectivity into the market, suggesting that the perception and media representation of a player can be nearly as valuable as their on-field performance.
The role of bargaining and negotiation has been identified as central. Speight and Thomas (1997), for example, studied how negotiated transfer fees compare with arbitration settlements, identifying systematic differences in bargaining outcomes where selling clubs often received less via arbitration, a pattern reinforcing the strategic value of negotiations in determining transfer fees. Meanwhile, Dobson et al. (2000) further validated these approaches across non-league contexts, affirming the general applicability of early theoretical models.
In terms of methodology, initial studies often relied on ordinary least squares (OLS) regression, but more recent and sophisticated research has increasingly adopted the Heckman two-step approach to correct for selection bias, a recognition that not all professional players have an equal likelihood of being transferred. Although early adoption was evident in Carmichael et al. (1999), this methodology is now more widely used, evidenced by further contributions from Depken and Globan (2021), and others. These studies first estimate the probability of a player being transferred, then use this insight to adjust the transfer fee equation, mitigating the bias introduced when analyses focus solely on transferred players.
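To make the second step of this procedure concrete, the following is a minimal sketch of the correction term usually added to the fee equation, the inverse Mills ratio, computed from a first-step probit index. It is an illustration under stated assumptions rather than the procedure of any cited study: the function name and the index values are hypothetical, and the full two-step estimation (probit fit, then augmented regression) is not shown.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def inverse_mills_ratio(z: float) -> float:
    """Return phi(z) / Phi(z) for a probit index value z."""
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)


# Hypothetical first-step probit indices for three transferred players.
probit_index = [-1.0, 0.0, 1.5]

# One correction term per player; in a two-step setup this would enter the
# fee regression as an additional regressor.
correction_terms = [inverse_mills_ratio(z) for z in probit_index]
```

Players with a low estimated probability of being transferred (a low index) receive a larger correction term, which is how the second-step regression accounts for non-random selection into the transferred sample.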
The importance of meticulously accounting for the characteristics of both buying and selling clubs has become a consensus in the literature, with research noting the elevated bargaining power of successful selling clubs (Carmichael and Thomas, 1993) and the positive price premium for contracts with longer duration or expiring later in time. At the same time, there is a persistent need for richer models that incorporate direct and position-specific measures of player quality. While goals and appearances are commonly included, more granular or nuanced statistics (such as those measuring defensive or creative contributions) are increasingly recognized as important for accurately capturing player value, given positional biases in traditional models.
Large-scale systematic reviews, such as Franceschi et al. (2024), have addressed data limitations and sample selection issues by reviewing dozens of studies and thousands of player transactions, revealing that the literature's empirical scope is much broader than previously characterized. Criticisms regarding limited coverage or sample size are thus being addressed as transfer market data becomes more robustly available, especially for top European leagues. Similarly, the influence of “superstar effects” and market segmentation remains an active research area, with newer studies applying advanced econometric and machine learning techniques to parse out the unique attributes of high-value transfers and to predict transfer outcomes across diverse segments of the player population.
Recent methodological advances reflect a trend towards increasingly data-driven and precise analysis. The International Centre for Sport Studies (CIES) and scholars like McHale and Holmes (2023) now leverage massive datasets from major football leagues, integrating variables ranging from granular performance data to contract specifics, team performance, and digital followership. These efforts yield models that not only predict transfer fees with greater accuracy but also unearth novel predictors absent in earlier frameworks.
Despite the progress, unresolved issues persist, especially concerning the direct quantification of non-performance factors (such as marketability, media exposure, and fan following) and how these interact with established determinants like player productivity and contractual complexity. The literature continues to grapple with documenting the full extent of transfer market segmentation and with refining theoretical explanations for observed price disparities among otherwise comparable players.
In sum, research on football player transfer pricing demonstrates a mature, evolving field, characterized by theoretical innovation, methodological rigor, and increasing empirical breadth. Contemporary studies not only confirm the importance of performance and contractual factors but also recognize the growing influence of brand value, media presence, and digital engagement. Ongoing debates about superstar effects, segmentation, and fairness speak to the dynamic interplay of economic, social, and cultural forces shaping the modern football transfer market.
Methodology
Data
In this paper, we address the challenges associated with the non-representativeness and selection bias of previous studies that used the hedonic pricing approach to estimate football prices. To solve these puzzles, we need not start with a set of transferring players but with a somewhat representative set of players, to which we will later append further data such as transfer prices (if any) and salaries. The FIFA games series by EA Sports (the series having been now replaced by EA Sports FC) or Pro Evolution Soccer by Konami on PlayStation and other platforms provide a large set of players. Player information is compiled in a so-called “Futhead” database, which is available on fan pages around the internet: it contains tens of thousands of players rated by experts for their skills. They provide data for all players, not just players who transfer. While we acknowledge the limitations of the “Futhead” database such as expertise of raters, consistency in evaluations, update frequency, and potential biases, we selected it due to the absence of any substantive alternative. Such data have already been used in recent research, see e.g., Yan (2020) on the evaluation of performance and market value of football players in the top five European men's football leagues, Behravan and Razavi (2021) on the estimation of football players’ value in the transfer market through machine learning, Richau et al. (2021) on the impact of investors and ownership structures on transfer fees in the English Premier League, as well as Coates and Parshakov (2022) on the determinants of the actual fees paid in a transfer of a player. The use of these data is consistent with and strengthens the more general growing relationship between sports and video games or esports, showing the mutual benefits between both (García and Murillo, 2020; Pizzo et al., 2022; Scelles et al., 2021). 
The data were then supplemented by transfer data between 2007/2008 and 2018/2019 and salaries between 2012/2013 and 2018/2019, all collected from the internet, notably from Transfermarkt (similar to Muller et al. (2017) mentioned above or, more recently, Felipe et al. (2020) in their study of the team variables and player positions that most influence the market value of professional male footballers in the top five European men's football leagues) and Sofifa (similar to other authors such as Payyappalli and Zhuang (2019) in their data-driven integer programming model for football clubs’ decision making on player transfers).
From these transfer data, it appears that the transfer fees are not normally distributed. Even when zero transfer fees and loans are taken out of the sample, the distribution does not appear to be normal, nor are the logarithms or any simple transformation of the transfer fees. This has an important implication since it means the Heckman correction, which rests on a normal dependent variable, could not be applied to correct for possible selection.
Most variables from the database are usual (see the complete list in Table 1), except the number of days remaining in contracts, which we could not obtain for every player. However, it has not been extensively tested so far to value transfer contracts – exceptions being articles published recently by Coates and Parshakov (2022), Garcia-del-Barrio and Pujol (2020), Richau et al. (2021) and Rubio Martin et al. (2022) – and was thus worth trying even if it reduced the number of observations greatly. Moreover, age is measured on both sides of the age of peak transfer price: lagem and lagem2 measure the years below 24 years while lagep and lagep2 count the years above to test for possible asymmetry of an age effect. Eventually, six skills are singled out, two for each group of positions (forward players = pace + shooting, midfielders = dribbling + passing, defenders = defending + physicality).
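The two-sided age measure can be sketched as follows. This is an illustration only: it assumes the four variables are plain year counts (and their squares) on either side of the peak age of 24 stated above, whereas the actual variables in the database may be constructed differently. The function name is hypothetical.

```python
PEAK_AGE = 24.0  # age of peak transfer price, as stated in the text


def age_terms(age: float, peak: float = PEAK_AGE) -> dict:
    """Split age into years below and above the peak, with squared terms."""
    below = max(peak - float(age), 0.0)  # years remaining before the peak
    above = max(float(age) - peak, 0.0)  # years elapsed after the peak
    return {
        "lagem": below,
        "lagem2": below ** 2,
        "lagep": above,
        "lagep2": above ** 2,
    }


young = age_terms(21)    # three years below the peak
veteran = age_terms(30)  # six years above the peak
```

Keeping the two sides separate lets a regression estimate different slopes and curvatures before and after the peak, which is what testing for asymmetry of the age effect requires.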
Variables categories and groups.
Notes: CL: Champions League, App.: Appearance. The “–2” column refers to lagged (previous two seasons) values of each skill or performance metric. The “two for each group of positions” means two core skill metrics are selected for forwards, midfielders, and defenders respectively.
Then, the measurement of club transfer activity must be explained. While several global rankings of clubs do exist, they do not contain all the clubs we have in the database. We thus had to develop an endogenous measure. The idea was to count clubs especially active on the transfer market by counting the number of transfer contracts in the database: co_cuclu and co_preclu provide such a count for the current (i.e., buying) and previous (i.e., selling) clubs. The magnitude of transactions is recorded by tot_cuclu and tot_preclu which sum up the total value of transfers (in the database) for the current (i.e., buying) and previous (i.e., selling) clubs. Since the database only spans seven seasons, the hierarchy of clubs does not move much; it would be appropriate to consider those measures in a moving time window.
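A minimal sketch of how such endogenous club measures could be built from a list of transfer records is given below. The record layout (keys `current_club`, `previous_club`, `fee`), the club names and the fee values are hypothetical and serve only to illustrate the counting and summing logic; no moving time window is applied.

```python
from collections import defaultdict


def club_activity(transfers):
    """Count and sum transfers per buying (current) and selling (previous) club."""
    measures = {
        "co_cuclu": defaultdict(int),      # number of purchases per buying club
        "co_preclu": defaultdict(int),     # number of sales per selling club
        "tot_cuclu": defaultdict(float),   # total fees paid per buying club
        "tot_preclu": defaultdict(float),  # total fees received per selling club
    }
    for t in transfers:
        measures["co_cuclu"][t["current_club"]] += 1
        measures["co_preclu"][t["previous_club"]] += 1
        measures["tot_cuclu"][t["current_club"]] += t["fee"]
        measures["tot_preclu"][t["previous_club"]] += t["fee"]
    return measures


# Hypothetical records (fees in millions of euros), for illustration only.
sample = [
    {"current_club": "Club A", "previous_club": "Club B", "fee": 10.0},
    {"current_club": "Club A", "previous_club": "Club C", "fee": 5.0},
    {"current_club": "Club B", "previous_club": "Club C", "fee": 0.0},
]
activity = club_activity(sample)
```

Restricting the input list to the seasons inside a given window before calling the function would give the moving-window variant mentioned above.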
Eventually, we could not append the full set of information to every player in the Futhead database since only 14,051 observations of salary (Table 2) and 13,500 observations of contract duration were gathered, the intersection being around 8000. Nevertheless, the database may be considered to retain the same properties as the starting universe (i.e., the Futhead database). Our sample is larger than those of all previous research and internationally diversified, so we can investigate whether transfer fee pricing is worldwide or segmented. It is worth mentioning that this global approach to the subject has never been tried in the literature except by Coates and Parshakov (2022). In contrast to these authors, we not only analyse our data worldwide with all player positions aggregated, but also per region as well as per position and region to identify any potential differences across these segments.
Unique salary observations in our database per country 2007–2019.
The salary observations in Table 2 are direct estimates of annual base salary awarded to each player via club contract, gross of tax and excluding bonuses or external income. This data enables our modeling of the relationship between individual and team performance metrics and actual player compensation, providing an accurate reflection of the wage-setting processes in elite football. Transfer fees and market values, often speculative and unrelated to individual pay, are not used in this analysis.
Accordingly, we first built a random global sample by starting from an existing player universe and appending data gathered over the internet. Second, we developed new measures of transfer prices which are not susceptible to being censored (i.e., without selectivity) and reflect the expectations of the stakeholders through the inclusion of the current and expected cost of the player in addition to the transfer fee. The overall design is demanding in terms of information quality and quantity, but the modelling of the overall transfer cost is better than with the transfer fee alone. Thus, such a global model, including 22 countries and 25 leagues, can satisfactorily explain the expenditure by buying clubs from 2007 to 2018 on players’ transfers and wages (see Table 3 for the balance of trade in these 25 leagues). We achieved homoscedasticity on segments of the global market (regions and player positions), proven to be consistent by a series of Chow tests. This illustrates how the global market for transfers might be adequately studied using this methodology on an even larger scale, as more data become available over time, complementing regional markets. Appendix 2 presents our descriptive statistics.
Balance of trade in the competitions studied (2007–2020).
Source: Transfermarkt.
Getting around selection issues
While most studies focus on transfer fees and, more recently, transfer market value (Franceschi et al., 2024), which is outside our scope, it might be worth refining the analysis before we decide on a dependent variable to be explained by the hedonic analysis. A transfer is a bargain between three sides: a buying club paying a selling club a transfer fee and to a player some future (certain) salaries and (uncertain) bonuses (one may add the agents, but we do not have figures about their earnings and assume they perceive a percentage of the other payments). From the player's perspective, he may want to extract the maximum out of the various clubs he is going to play with; hence, his program is to maximise the sum of his incoming cash flows:
From an objective point of view, the club is willing to minimise the cost incurred when hiring the player, and this cost breaks down into a transfer fee plus an agreed-on salary for the duration of the contract, and some additional contingent costs such as bonuses, which are not known on the day the contract is signed (but the list of events triggering bonuses may be in the contract):
These quantities may be approximated by (in order of greater complexity):
as an approximation for the whole
The advantage over the raw transfer fee is not just adding some marginal information for transferring players: taking salaries into consideration guarantees that the dependent variable is not censored. For better adequacy, the remaining contract duration should be considered. To ensure the model is meaningful, we also included in the database players on loan and players whose contracts have ended, who should thus transfer for free. Those latter players help test the consistency of estimations provided by the model, since strictly speaking, a player with six months remaining in the contract should not have a package very different from a player with a contract that just ended, albeit the distribution between the club and the player may differ significantly. Our model is not suitable to analyse this effect, though.
where PV stands for potential value.
This package features undisclosed elements (such as the contingent payment scheme) and a double uncertainty, both on the realization of the contingent events and on what will happen beyond the horizon of the contract. We can think of all those elements as conditional on the current salary, and it is not unreasonable to think that future salaries can be expected to vary according to the cross-sectional variation of salaries in the database. That is to say, when a player's age grows by one unit, his salary is adjusted according to the average variation for players of his age, and the probability that he remains a professional player is given by the average probability for players of his age. Those salaries and probabilities certainly do not evolve uniformly across the spectrum of all players (Figure 2), but this coarse approximation of evolution patterns is a starting point to compute this subjective package as:

Average salaries per age per season.
Where
Eventually, SCP can be written as:
Eventually, the player may be interested in the transfer fee as a signal of the willingness of the club to pay, but he is likely concerned only by what he will take from his club; hence we can define:
We assumed as with the SCP that the future salaries are dependent on the current relative salary and the average evolution in the database hence:
It seems fairly clear that the complete specification of the subjective package is well beyond our current knowledge of the stakeholders: we do not have data on the risk preferences of football players, nor on the interest rate that players and clubs use to discount future opportunities. We had to make a series of assumptions to discuss the general idea that we want to capture a non-linear relationship between transfer fee, age and current salary. We assumed players to be risk-neutral and adopted a zero discount rate.
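The salary component of such a package can be sketched as follows under the stated assumptions of risk neutrality and a zero discount rate. This is not the formula used in the paper: it is one possible reading in which the player's salary relative to the average for his age is held constant, future salaries follow the cross-sectional average by age, and each future season is weighted by the probability of still being a professional. The function name, the lookup tables and all numbers are hypothetical.

```python
def expected_salary_stream(current_salary, age, avg_salary_by_age,
                           stay_prob_by_age, horizon):
    """Expected undiscounted salary over `horizon` seasons, starting now."""
    # Salary relative to the average for the player's current age.
    relative = current_salary / avg_salary_by_age[age]
    total = current_salary  # current season, taken as certain
    survival = 1.0          # probability of still being a professional
    for k in range(1, horizon):
        # Probability of remaining a professional from age+k-1 to age+k.
        survival *= stay_prob_by_age[age + k - 1]
        # Zero discount rate: future seasons are simply added up.
        total += survival * relative * avg_salary_by_age[age + k]
    return total


# Hypothetical averages (millions of euros) and continuation probabilities.
avg_salary = {25: 1.0, 26: 1.2, 27: 1.1}
stay_prob = {25: 0.9, 26: 0.8}

stream = expected_salary_stream(2.0, 25, avg_salary, stay_prob, horizon=3)
```

Because the result depends only on salary and age, a quantity of this kind is defined for every player with a recorded salary, whether or not he is transferred.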
It is quite obvious that the aforementioned packages do not provide a direct estimate for the transfer fee. The transfer fee can be computed very simply, though, from the packages predicted by the models, since:
Eventually, we have four dependent variables to try to value, and three of them are not censored. We can thus use these packages in the selection equation of a Heckman-inspired regression. Since the OP and SCP depend on the transfer fee, it might be better to look only at the SSP, as done in the next section.
Estimation strategy
Our standard hedonic model uses a log-linear equation to determine the dependent variable's value based on the player's skills, personal traits, and control factors, i.e.,:
Regressing the transfer fee (as previous studies have done) brings many significant results (Table 4): the transfer price is an increasing function of the duration of the contract, internet visibility or Google hits (although not significant at the world level), player skills and buying club transfer activity, and is negatively affected by the end of the contract (‘free transfer’) or the transfer being a loan. There are some consistency problems when the regression is broken down by continent or position: for instance, yearly dummies tend to be significant and negative at the world level but not necessarily at the continent level (except for Latin America), while the Chow test shows that the disaggregated model is better. The Breusch-Pagan statistics (‘Chi2’) indicate heteroscedasticity, which disaggregation cannot reduce. While this does not make the model irrelevant, it means that the regression does not render the granularity of the data well. The R-squared is largest for Europe, exceeding that of the global sample, while it is relatively lower for Latin America and USA/China.
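For readers unfamiliar with the heteroscedasticity statistic, the following is a minimal sketch of a Breusch-Pagan type test in its studentized (n times R-squared) form with a single regressor. It is an illustration only: the ‘Chi2’ values reported in the tables may have been computed with a different variant and with the full set of regressors, and the data below are constructed for the example.

```python
def ols_simple(x, y):
    """Intercept and slope of a least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """Share of the variance of y explained by the fitted line."""
    a, b = ols_simple(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_lm(x, y):
    """LM statistic: n * R^2 from regressing squared residuals on x."""
    a, b = ols_simple(x, y)
    sq_resid = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, sq_resid)


# Constructed example: the error spread triples in the second half of x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
errors = [1, -1, -1, 1, 3, -3, -3, 3]
y = [2 + 0.5 * xi + ei for xi, ei in zip(x, errors)]

lm = breusch_pagan_lm(x, y)  # compare with a chi-squared(1) critical value
```

With one regressor the statistic is compared with a chi-squared distribution with one degree of freedom (5% critical value of about 3.84); a larger value points to error variance that changes with the regressor.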
Regression with the transfer fee being the dependent variable.
Figure 3 is the scatterplot between actual transfer values and predicted values. The fit is quite good here, although perhaps not as good as that of Poli et al. (2017), who reported R² values above 80% for their models. That is, their model accounts for more than 80% of the observed variance in transfer fees (or, more precisely, in the logarithms of those fees). This is a very good result for a statistical estimation procedure.

Scatterplot between actual transfer values and predicted values.
Using the packages to look at the breakdown by continent gives the same kind of results as with the transfer fee, while the Breusch-Pagan statistic is significantly lower to a point where heteroscedasticity can disappear in some instances (Table 5). The yearly dummies are consistent between the global market and the European market but not with other continents: although these dummies may be interpreted as a price index due to inflation over time (see Figure 1 for the Big Five), it seems that the price of transfers is not evolving consistently across continents; hence, breaking down the regression is required, as the Chow test shows.
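The logic of the Chow test used to decide between a pooled and a disaggregated regression can be sketched as follows. The function takes residual sums of squares as inputs, and all numbers are hypothetical; it illustrates the textbook F statistic, not the exact computation behind the reported tests.

```python
def chow_f(rss_pooled, rss_groups, n_obs, n_params):
    """Chow F statistic for pooled versus group-wise regressions.

    rss_pooled : residual sum of squares of the single pooled regression
    rss_groups : residual sums of squares of the separate group regressions
    n_obs      : total number of observations across all groups
    n_params   : number of estimated coefficients in each regression
    """
    n_groups = len(rss_groups)
    rss_split = sum(rss_groups)
    df_num = (n_groups - 1) * n_params    # extra coefficients when splitting
    df_den = n_obs - n_groups * n_params  # residual degrees of freedom
    return ((rss_pooled - rss_split) / df_num) / (rss_split / df_den)


# Hypothetical example: two segments, 100 observations, 5 coefficients each.
f_stat = chow_f(rss_pooled=120.0, rss_groups=[40.0, 50.0], n_obs=100, n_params=5)
```

A large F statistic, judged against an F distribution with the two degrees of freedom above, indicates that the coefficients differ across segments and that the breakdown is warranted.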
Global model + breakdown by continent for objective package / subjective complete package / subjective salary package.
Robust standard errors in parentheses; ***p<0.01, **p<0.05, *p<0.1.
Breaking down by positions is performed only with the subjective complete package (SCP): while the result is consistent with other packages, SCP almost removes heteroscedasticity. This indicates not only that the estimated coefficients are unbiased, but also that the segmentation may be relevant to the resolution of the data. Table 6 is devoted to the subset of forward players: while there is heteroscedasticity, the Chow test indicates the aggregate model is better than a breakdown by continent. Unsurprisingly, the value of forward players’ transfer is linked to their specific skills such as “shooting”, “dribbling” and “passing” rather than other skills, which are more related to other positions. In addition to those skills variables, some other factors were significant, like some age variables. The involvement of the buying and selling clubs in the transfer business was also noticeable through appropriate variables. We again notice that the remaining duration of the contract before a transfer deal is reached is positively significant. While a Chow test does not favour disaggregation by continent, a separation between England and the rest of the world makes sense. Interestingly, England favours more “defending” and “physicality”, in contrast with the world level. This is consistent with the perception that England is a highly competitive and intense league (Bradley et al., 2016).
Breakdown by position and continent for the subjective complete package – forward players (strikers).
Robust standard errors in parentheses.
***p<0.01, **p<0.05, *p<0.1.
The same holds for defenders and defensive midfielders (Table 7): the best segmentation (according to Chow tests) is England vs the rest of the world. Unsurprisingly, the defending and physicality skills are valued. The English market is more visible, since Google Trends has a positive impact on player value. The other variables have the same impact as for forward players. Defensive midfielders were grouped with defenders as the result of another Chow test.
Breakdown by position and continent for the subjective complete package - defenders and defensive midfielders.
Robust standard errors in parentheses.
***p<0.01, **p<0.05, *p<0.1.
For (non-defensive) midfielders (Table 8), heteroscedasticity is higher than for other positions. Both the English and the Italian markets are singled out: there seem to be some singularities both in the appreciation of the players (height is preferred, youth is an asset) and in their situation (loans seem to be priced like regular transfers of shorter duration). England displays a significant negative impact of the loan variable, with a coefficient higher in absolute value than the one at the world level. This result may be due to clubs’ willingness to lend their players, even at little or no cost, to other clubs in order to put them on temporary display and sell them more easily. Clubs outside the Premier League may also favour loaning their players to Premier League clubs rather than within their domestic league so that they do not take the risk of strengthening their domestic competitors (Feuillet et al., 2021). The number of observations may be too small, and the interest of the regression lies more in showing the difference with the rest of the world than in specifying a very precise model.
Breakdown by position and continent for the subjective complete package - non-defensive midfielders.
Discussion and conclusion
Main findings
The analysis of football transfer fees in this article reveals a complex interplay of factors influencing their value, highlighting the need for a nuanced understanding that goes beyond simple aggregate models. While contract duration, internet visibility, player skills, and buying club activity consistently demonstrate a positive correlation with transfer fees, the significance and relative importance of these factors exhibit considerable geographic variation. The superior fit of the model for European data compared to other regions underscores the need for regionally specific analyses, acknowledging that the global market for football players is not monolithic.
Heteroscedasticity, a persistent challenge in the regression analysis, underscores the limitations of a purely aggregate approach. While disaggregating the data by continent partially addresses this issue, the unequal variance of errors suggests that unobserved heterogeneity, possibly related to specific market dynamics or institutional factors, remains a significant influence on transfer fee determination. Further research should delve deeper into these contextual factors to refine the model and improve predictive accuracy.
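As an informal illustration of what unequal error variance across segments looks like, the sketch below groups regression residuals by segment and compares their sample variances. The residual values, segment labels, and function names are hypothetical and are not taken from the study's data. It is a descriptive check only; a formal diagnostic (for example, a Breusch-Pagan test) would be needed for inference.

```python
from collections import defaultdict
from statistics import variance


def residual_variance_by_segment(residuals, segments):
    """Return the sample variance of residuals within each segment."""
    grouped = defaultdict(list)
    for residual, segment in zip(residuals, segments):
        grouped[segment].append(residual)
    # statistics.variance needs at least two residuals per segment.
    return {segment: variance(values) for segment, values in grouped.items()}


def variance_ratio(variances):
    """Ratio of the largest to the smallest segment variance (1.0 = equal)."""
    return max(variances.values()) / min(variances.values())


# Hypothetical residuals from a log transfer fee regression (illustration only).
toy_residuals = [0.2, -0.1, 0.1, -0.2, 0.9, -1.1, 1.0, -0.8]
toy_segments = ["Europe"] * 4 + ["Other"] * 4

toy_variances = residual_variance_by_segment(toy_residuals, toy_segments)
print(toy_variances)
print(variance_ratio(toy_variances))
```

A ratio well above 1 indicates that the error variance differs between segments, which is the pattern that segment-specific regressions or robust standard errors are meant to accommodate.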
The incorporation of yearly dummy variables acknowledges the temporal dimension of transfer values, suggesting that inflation or other macroeconomic factors are at play. However, the inconsistent temporal patterns across continents indicate that these influences are not uniformly distributed globally. This highlights the need to account for regional economic conditions and their specific impacts on the market for football players.
Analysing transfer fees by player position reveals crucial insights into the relative importance of specific skills. For forward players, shooting and dribbling abilities are particularly significant, while defensive players’ values are more strongly influenced by their defensive skills. Even within positions, regional variation is evident, with England often exhibiting distinct characteristics compared to the rest of the world. This underscores the limitations of a one-size-fits-all model and the importance of regional context.
The unique characteristics of the English market in loan deals, showing a negative correlation for non-defensive midfielders, highlight potential strategic market behaviour, consistent with Feuillet et al. (2021). The practice of loaning players at little or no cost, primarily within the Premier League, suggests a strategy of showcasing players for potential future sales. This behaviour contrasts with markets outside the Premier League, implying that strategic considerations and differing risk tolerances are crucial elements of the transfer market. Future research should explore these strategic aspects of player transfers more deeply, for example by building on Feuillet et al. (2021) and our own study.
In conclusion, this research offers valuable insights into the multifaceted determinants of football transfer fees. While broadly identifying key variables, the study emphasizes the critical role of regional context and player position, revealing inconsistencies that require region- and position-specific models to fully understand the dynamics at play. The presence of heteroscedasticity necessitates further investigation into unobserved factors and potentially more sophisticated modelling techniques to achieve a more comprehensive understanding of this complex market.
Practical (managerial/policy) implications
Evaluating players based on their performance statistics is not a new approach. Studies on the determinants of transfer fees followed a similar path while evaluating players based on subjective scores given by experts (used in video games like PlayStation games). Clubs also use this type of analysis when considering buying or selling a player, and players and/or their agents could use it to understand whether a selling club is asking for a realistic transfer fee. In addition, football governing bodies (e.g., FIFA) could use it to assess whether a transfer fee is realistic or whether further investigation is required. Notably, some variables missing from previous studies (e.g., the remaining duration of contracts) prove influential when video gaming data are used. Moreover, loans and free transfers were included, through their dummy variables, for the first time in a study, making the pricing function of this research unique and comprehensive. Media criticism of agents' alleged greed, which blames them for rising transfer fees and salaries, made it important to test the significance of the agent dummy variables. These proved insignificant, whereas the buying clubs’ financial strength proved significant.
Theoretical implications
The findings of this paper have some theoretical implications. The use of video gaming data to price football transfers may allow for the incorporation of new factors that affect a player's value, such as in-game performance, popularity among fans, and overall impact on the team. This implies reciprocal benefits from using both actual sports data and video gaming data, which can lead to a more comprehensive understanding of a player's value and potentially more accurate pricing. The implication is that newer techniques that incorporate both types of data are becoming relevant and may be effective in pricing football transfers. For instance, machine learning techniques, such as nonlinear regression methods, may show promise in improving the accuracy of pricing football transfers. These techniques can help identify complex relationships between variables and potentially outperform traditional methods. Finally, in contrast to previous literature, video gaming data make it possible to conduct a comprehensive global analysis that distinguishes between different segments.
Limitations and directions for future research
While the data used in this paper is certainly rich, it has limitations. We acknowledge that some variables in our regressions, such as goals and Champions League appearances, may be jointly determined, raising concerns about multicollinearity and endogeneity. Our analysis focused on predictive accuracy rather than strict causal inference, and we interpreted coefficient estimates descriptively. We performed correlation and VIF checks before estimation and found no evidence of problematic multicollinearity.
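The VIF check mentioned above can be illustrated as follows. This is a minimal sketch in plain Python, not the authors' code: each predictor is regressed on the remaining predictors by ordinary least squares, and its variance inflation factor is computed as 1 / (1 - R²). The variable names and toy values are hypothetical, and the thresholds often quoted for "problematic" VIFs (such as 5 or 10) are conventions rather than results from this study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(y, columns):
    """R^2 of an OLS regression of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)]
           for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(predictors):
    """Map each predictor name to its VIF = 1 / (1 - R^2 on the others)."""
    vifs = {}
    for name, column in predictors.items():
        others = [c for other, c in predictors.items() if other != name]
        vifs[name] = 1.0 / (1.0 - r_squared(column, others))
    return vifs


# Hypothetical player-level predictors (illustration only).
toy_predictors = {
    "goals": [2, 5, 9, 12, 4, 15, 7, 1],
    "champions_league_apps": [0, 3, 6, 8, 1, 10, 4, 0],
    "age": [19, 24, 27, 22, 30, 25, 21, 33],
}
toy_vifs = variance_inflation_factors(toy_predictors)
for name, vif in toy_vifs.items():
    print(name, round(vif, 2))
```

A VIF close to 1 means a predictor is nearly uncorrelated with the others, while larger values signal that its coefficient is estimated less precisely because of overlap with the remaining regressors.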
The video gaming data used to price football transfers may be incomplete: they may not capture all relevant factors that affect the pricing function of a football player, such as injuries, team dynamics, and player behaviour off the field. The approach may also not be generalizable to all football players and teams, since the data may only be representative of a specific subset of players and teams and the results may not be applicable to other contexts. In addition, the approach may lack transparency, as it may not be clear how the data are being used to determine the value of a player. This lack of transparency may lead to scepticism and mistrust among stakeholders in the football industry.
To address these challenges, we suggest several directions for future research. First, future research could incorporate additional data sources, such as social media data, to improve the accuracy and completeness of the data used to price football transfers. Second, new machine learning techniques could be developed to better analyse and interpret these data. Such techniques could help to identify new factors that affect the pricing function of a football player and improve the accuracy of transfer fee estimates. Third, to address concerns about the lack of transparency, future research could focus on developing more transparent and explainable models, which could help to build trust and confidence among stakeholders in the football industry. Finally, future research could consider the ethical implications of using video gaming data to price football transfers. This could involve examining issues such as data privacy, fairness, and bias, and developing guidelines and best practices for the responsible use of football data.
Addressing post-pandemic effects is an important direction for future research, as the current paper covers seasons only up to 2019/2020.
In terms of future model refinement to enhance predictive accuracy and interpretability, the choice of model depends on the nature of the data, the complexity of relationships between variables, and the specific goals of the analysis. Machine learning models are particularly promising for improving predictive accuracy in transfer fee estimation, while advanced regression methods like GLM and fixed effects models offer robust alternatives to address statistical limitations of OLS. Finally, future work could explore threshold effects and ascertain the optimal age for footballers in terms of their transfer fees. We were unable to do this, probably because of the different packages. However, it is important to note that this is very feasible with actual transfer fees. Future research could build on our analysis and apply these directions for further model refinement.
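One common way to look for an "optimal age" is to include both age and age squared in a transfer fee regression and locate the turning point of the fitted curve. The sketch below assumes such a quadratic specification, which this section does not report, and the coefficient values are hypothetical. With coefficients b1 on age and b2 on age squared, the fitted fee peaks at -b1 / (2 * b2), provided b2 is negative.

```python
def peak_age(beta_age, beta_age_sq):
    """Turning point of beta_age * age + beta_age_sq * age**2.

    A fee-maximising age only exists when the squared-age coefficient
    is negative (an inverted U shape).
    """
    if beta_age_sq >= 0:
        raise ValueError("squared-age coefficient must be negative for a peak")
    return -beta_age / (2.0 * beta_age_sq)


# Hypothetical coefficients (illustration only, not estimates from the paper).
example_peak = peak_age(0.52, -0.01)
print(example_peak)
```

Threshold effects in the stricter sense (for instance, a break at a given age) would require a different specification, such as a piecewise regression.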
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
