Google Search Interest versus Self-Report Surveys: Examining Predictors of Gambling Expenditure in Australia

Abstract

Gambling is a major public policy issue in Australia, with Australians losing more money per capita on gambling than any other country. Given growing concerns over gambling expenditure, understanding how to predict consumer spending on gambling is crucial. This study compares the predictive power of Google search interest and self-reported data in estimating gambling expenditure. Analysing data from a range of secondary sources, we assess gambling behaviour across type (sports betting) and brands (BetEasy, Ladbrokes, Sportsbet) at the national and state level. Our findings indicate that Google search interest is a reliable predictor of gambling expenditure, providing real-time insights that align with actual spending patterns. In contrast, self-reported data show greater variability. Google search interest offers a stable, objective and cost-effective alternative to self-report surveys, enabling real-time monitoring of consumer behaviour and long-term trend analysis. These findings highlight its value for researchers and policymakers in understanding gambling behaviour to monitor the effectiveness of regulatory and harm reduction strategies.

Keywords

gambling, self-report survey, Google search, regression models, public policy

Introduction

Gambling is globally recognised as a public health issue, with harms that extend beyond individuals to affect families and communities through outcomes such as job loss, violence, relationship breakdown, suicide, education disruption and homelessness (Allami et al., 2021; van Schalkwyk et al., 2021). A significant driver of these growing harms is the rapid growth of online gambling, projected to increase from a global market value of $4.9 billion in 2024 to $8.5 billion by 2033 (IMARC, 2024), due to technological advancements, mobile device proliferation and greater internet accessibility (Australian Institute of Health and Welfare, 2023; Badu et al., 2023; S. M. Gainsbury et al., 2013; Hing et al., 2017; S. L. Thomas et al., 2022; S. Thomas et al., 2023a, 2023b). Australia is particularly impacted, with 21.2% of the population engaging in online gambling at the end of 2024 (Jenkinson et al., 2020), a trend accelerated by the COVID-19 pandemic (Australian Communications and Media Authority [ACMA], 2022). Because the country also ranks among the highest globally in per capita gambling expenditure, with total losses of approximately $25 billion annually (AIHW, 2024), concerns are growing about the financial and social harms associated with the widespread availability of gambling, cultural normalisation and heavy industry advertising and sponsorship (S. M. Gainsbury et al., 2013; Hing et al., 2017; S. Thomas et al., 2023a, 2023b). In light of these trends, there is a critical need for reliable, real-time data to predict gambling behaviour, which can be used by policymakers, researchers and public health professionals to evaluate the effectiveness of harm reduction public health campaigns and education (e.g. community workshops, mass media campaigns) and regulatory changes (e.g. restrictions on advertising, bet limits, credit card use).

While traditional survey data offers valuable insights into self-reported gambling behaviour, self-report is often limited by delays in data availability, bias and high resource demands (Howse et al., 2022). Additionally, response rates have either remained low or declined over time, which threatens the accuracy and representativeness of the data collected (Mölenberg et al., 2021). In contrast, Google search data offers a real-time, unobtrusive and population-level behavioural signal that can reflect public interest and intent around gambling activities. However, the relative accuracy and reliability of such data for predicting actual gambling behaviour remains underexplored. This study addresses this gap by triangulating the predictive power of Google search trends and self-reported gambling survey data against actual gambling expenditure figures. Our study extends beyond the work of Houghton et al. (2023), who analysed Google Trends data in the UK during the COVID-19 pandemic, by adopting a broader, predictive approach across nine time points (2011, 2015 and 2018 to 2024) in the Australian context. Unlike their correlation-based method, we apply modelling approaches to compare Google search interest and self-reported gambling data in predicting actual gambling expenditure. Additionally, our study segments gambling behaviour by both type (sports betting) and specific brands (Sportsbet, Ladbrokes, BetEasy) across the national and state level, providing a more nuanced analysis of consumer gambling behaviours.

Self-reported gambling surveys: Limitations

Self-reported gambling data is often influenced by social desirability bias, where respondents might under-report behaviours perceived as undesirable (Kuentzel et al., 2008; Schell et al., 2021). Individuals may also under-report their gambling activities due to stigma (Brown & Russell, 2020) or legal concerns (S. Gainsbury et al., 2013). Self-reported measures are subject to recall bias and may not accurately capture past behaviours or the intensity of interest (Heirene et al., 2022). Self-reported surveys may have limited sample sizes or exclude certain populations, such as individuals from lower socioeconomic backgrounds and minority ethnic groups, due to sampling constraints (Goldstein et al., 2017; Rong & Wilkinson, 2011; Woodside, 2011) resulting in bias. Overall, there are ethical concerns associated with survey research in behaviours such as gambling, where confidentiality and data protection are critical considerations (Griffiths & Whitty, 2010). Alternatively, Google search interest can potentially overcome some of these ethical and methodological challenges of self-reported surveys.

Google trends: Search interest

Google’s dominance in the search engine market is unparalleled, capturing approximately 81.95% of the global market share (Statista, 2024). The volume of search queries processed by Google – around 99,000 per second, 8.5 billion per day and nearly 2 trillion per year (Flensted, 2024) – provides a rich dataset that reflects real-time consumer behaviour. Previous marketing research has used Google to obtain insights (Soutar & Murphy, 2009). Google data is made publicly accessible through Google Trends (www.google.com/trends), which reports search interest for a specific search term. This metric is standardised relative to all other search terms within a specified location and time period (Houghton et al., 2023). Google then rescales the resulting estimates to assign a search interest index in the range of 0 to 100 based on the search’s popularity compared to all searches on all topics (Houghton et al., 2023).
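
The rescaling described above can be illustrated with a minimal sketch. Google does not publish its full procedure (for example, how queries are sampled), so the function below is only an approximation of the description given here; the function name and the weekly counts are hypothetical.

```python
def search_interest_index(term_counts, total_counts):
    """Rescale a term's share of all searches to a 0-100 index (illustrative only)."""
    # Relative popularity: the term's share of all searches in each period.
    shares = [term / total for term, total in zip(term_counts, total_counts)]
    peak = max(shares)
    if peak == 0:
        return [0 for _ in shares]
    # The period with the highest share is assigned 100; others scale against it.
    return [round(100 * share / peak) for share in shares]


# Hypothetical weekly counts for one term and for all searches in a region.
weekly_term = [120, 300, 150, 60]
weekly_all = [10_000, 12_000, 10_000, 8_000]
index = search_interest_index(weekly_term, weekly_all)
print(index)
```

Because the index is built from shares rather than raw counts, a value of 100 marks the period in which the term was most popular relative to all searches, not the period with the largest absolute number of searches.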

Google search interest: Advantages

Google search interest has numerous advantages over self-reported measures, including improved accuracy, broader scope, real-time analysis capabilities and practical benefits such as cost-effectiveness and privacy considerations. As Google search interest data is an aggregate of actual behaviour, it is not subject to self-report biases and may provide a more accurate reflection of consumer gambling behaviour. Search interest data can capture real-time fluctuations in interest towards various behaviours such as gambling, offering precise measurements that reflect current trends and patterns. Google search interest data encompasses a broad audience, providing insights across diverse demographics and geographies. This wide coverage offers a comprehensive picture of consumer interest. Gathering Google search interest data is generally less resource-intensive than conducting large-scale surveys (Markham et al., 2014). As such, we argue that Google search interest is a cost-effective method for continuous monitoring of gambling trends and is especially useful for researchers and organisations with limited budgets. Since search interest data is aggregated and anonymous, it respects individual privacy and eliminates the risk of personal data exposure.

Google Search Interest: Predictive Power

Research has increasingly leveraged search query data to forecast real-world behaviours across a wide range of domains. For instance, Johnson et al. (2004) demonstrated that tracking visits to flu information websites could be used to predict influenza outbreaks. Similarly, Ginsberg et al. (2009) found that searches for flu-related terms could anticipate outbreaks up to 2 weeks before identification by the Centers for Disease Control and Prevention (CDC), highlighting the timeliness and predictive value of online search behaviour in public health monitoring. Beyond epidemiology, Google Trends data has also proven valuable for economic forecasting. Choi and Varian (2012) found that Google Trends can be used to forecast initial claims for unemployment benefits, the Consumer Confidence Index in Australia and international visitor numbers to Hong Kong. Wu and Brynjolfsson (2015) showed that search interest for terms related to home and car purchases was a strong predictor of actual sales trends. Additionally, search interest data has been used to anticipate first-weekend box office revenues, video game sales and music chart rankings (Goel et al., 2010). A growing body of marketing research has demonstrated the strong predictive value of Google Trends for understanding consumer behaviour across a wide range of contexts, including forecasting the adoption of new technologies such as the iPhone and iPad (Chumnumpan & Shi, 2019), measuring the impact of advertising on driving actual sales (Hu et al., 2014), identifying shifts in consumer preferences for specific product features such as fuel economy (Du et al., 2015) and tracking brand equity to forecast company revenue (France et al., 2021). Collectively, these studies make a compelling case for Google Trends as a powerful and versatile tool for capturing consumer interest and enhancing behavioural forecasting in marketing research.

In the context of gambling behaviour, Houghton et al. (2023) examined the relationship between Google Trends data and operator gambling data by conducting cross-correlations during COVID-19 lockdowns (March 2020 to March 2022). Their study analysed Google monthly searches alongside operator-reported data, such as the number of active players per month, total bets placed per month and gross gambling yield across sports betting, slots and poker. Search data showed a strong correlation with commercial behavioural data for sports betting and poker but not for slots gambling. This suggests that search trends may serve as a potential indicator of shifts in gambling behaviour for specific types of gambling but may be less reliable for others.

While Houghton et al. (2023) provided valuable insights into the relationship between Google Trends data and gambling behaviour, their study primarily focused on establishing correlations between search interest and operator-reported gambling activity. However, their research did not compare Google Trends data to self-reported gambling measures, leaving a critical gap in understanding whether search data offers a superior or complementary approach to traditional survey methods. Our study builds on this foundation by directly testing the predictive validity of Google search interest against self-reported gambling behaviour, providing a more rigorous evaluation of its usefulness as a consumer behaviour monitoring tool. Furthermore, we extend the timeframe and scope of analysis beyond the COVID-19 lockdowns to cover the pre-, during and post-COVID-19 periods, potentially uncovering longer-term trends and patterns that were not explored in prior work. Through these contributions, our study offers both methodological advancements and practical implications for regulators and policymakers seeking to track gambling trends more accurately.

Method

We analysed Google Trends data to examine public search interest related to gambling in Australia. Searches were conducted using trends.google.com, where relevant gambling-related keywords (e.g. ‘gambling’) were entered, and results were filtered by geographic region (Australia), time range (calendar year) and search type (web search). Data were captured for nine calendar years: 2011, 2015, 2018, 2019, 2020, 2021, 2022, 2023 and 2024. For each year, we used the ‘full year’ function in Google Trends to download weekly search interest data as CSV files. From this weekly data, we calculated an annual average search interest score for each term and year.
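
The annual averaging step can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the two-column CSV layout, the column name `gambling` and the weekly values are all assumptions made for the example.

```python
import csv
import io
from statistics import mean


def annual_average(csv_text, value_column):
    """Average the weekly search interest values in one year's export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    weekly_values = [float(row[value_column]) for row in reader]
    return mean(weekly_values)


# Hypothetical excerpt of a weekly export (a real file would hold about 52 rows).
sample_csv = """Week,gambling
2019-01-06,52
2019-01-13,48
2019-01-20,55
2019-01-27,61
"""

avg_2019 = annual_average(sample_csv, "gambling")
print(avg_2019)
```

Real exports may carry extra metadata lines before the header and non-numeric entries such as `<1` for very low interest, so some cleaning would likely be needed before averaging.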

To obtain state-specific search interest, we used the ‘sub-region’ function within Google Trends, which allows users to break down search interest by geographic region within a country. By selecting Australia as the target region and specifying the desired time period and search term, we extracted relative interest values by state (i.e. New South Wales, Victoria, Queensland, South Australia, Western Australia, Tasmania and Northern Territory). These values are reported by Google Trends as a normalised index (0–100) based on the proportion of total searches in each state, allowing for valid cross-regional comparisons within each year (see Appendix 1 for visual representation of Google Trends).

Search terms were selected based on prior work by Houghton et al. (2023) and included keywords associated with prominent gambling categories and operators identified in the Gambling Research Australia (2021) report. The final list of terms included: ‘Gambling’, ‘Sports Betting’, ‘BetEasy’, ‘Ladbrokes’ and ‘Sportsbet’.

To contextualise these trends in consumer interest, we compiled self-reported behavioural data from multiple nationally representative sources. These included the Household, Income and Labour Dynamics in Australia (HILDA) Survey for the years 2015, 2018 and 2022; the Australian National University’s Gambling in Australia (ANU) survey for 2019 through 2024; the Australian Institute of Health and Welfare (AIHW) National Gambling Trends Study for 2022; and the Gambling Research Australia (GRA) study for 2011, 2019 and 2021. Each source is treated as a distinct observation in the dataset to preserve differences in sampling and measurement.

Although multiple behavioural indicators were available across sources, including product-specific gambling, online versus venue mode, time spent gambling and Problem Gambling Severity Index (PGSI) scores, these measures were not consistently defined or available across all years (see Table 1). To enable comparison across time, we focused on overall gambling participation as our key self-report behavioural measure, as it is the most consistently reported and conceptually aligned indicator across sources.

Table 1.

Data Availability by Source.

Year	Participation (binary)	PGSI	Mode (online vs venue)	Time spent gambling
2011	GRA	GRA	-	-
2015	HILDA	-	-	-
2018	HILDA	-	-	-
2019	GRA	GRA	GRA	-
2020	ANU	ANU	-	-
2021	ANU, IGS	ANU, IGS	GRA	-
2022	HILDA, AIHW	HILDA, AIHW	AIHW	-
2023	ANU	ANU	ANU	ANU
2024	ANU	ANU	ANU	ANU

National gambling expenditure data were sourced from the Australian Gambling Statistics (38th edition), published by the Queensland Government Statistician’s Office, covering the years 1995–96 to 2020–21. We used total gambling expenditure, defined as net losses to consumers in Australian dollars (AUD), that is, total wagers minus winnings (see Table AUS 5, p. 313, Australian Gambling Statistics). This measure offers a more accurate indicator than turnover, which includes recycled winnings and can substantially overstate actual consumer losses. For 2022 and 2023, national expenditure values were supplemented by media reports citing updated figures from government sources (Equity Economics, 2025; QGSO, 2024). The 2024 figure is a projection based on industry per capita forecasts (IBISWorld, 2024).

In addition to overall gambling expenditure, we specifically examined wagering expenditure (i.e. expenditure on sports betting) to better align with the product-level analysis conducted in our research. Data analyses were conducted using RStudio (version 4.4.0). All data files and the accompanying data dictionary are available via the Open Science Framework. Table 2 summarises all sources, reporting years and key variables included in the final dataset.

Table 2.

Self-Reported Gambling Behaviour: Data Sources and Variables.

Data source	Year	Sample size	Key variables	Variable definition	Notes
HILDA	2015	13,445	Gambling participation; monthly spend	Binary indicator of past-year participation in 10 gambling types (e.g., lotteries, sports betting); self-reported monthly gambling expenditure	Self-completed questionnaire; see HILDA 2024 Report pp. 151–152
HILDA	2018	13,646	Gambling participation; monthly spend	Binary indicator of past-year participation in 10 gambling types (e.g., lotteries, sports betting); self-reported monthly gambling expenditure	Self-completed questionnaire; see HILDA 2024 Report pp. 151–152
HILDA	2022	14,814	Gambling participation; monthly spend; PGSI	Binary indicator of gambling participation; monthly spend; PGSI (Problem Gambling Severity Index, 0–27) to assess gambling harm severity	See HILDA 2024 Report pp. 151–154
ANU Gambling Survey	2020	3,000	Participation, frequency, PGSI	Binary indicator of past-year gambling participation across multiple activity types (e.g., lotteries, sports betting, EGMs); gambling frequency by activity (e.g., weekly, monthly); PGSI (Problem Gambling Severity Index, 0–27) to assess gambling harm severity	Online panel; see Gambling in Australia 2024, pp. 4–6 and 7–10
ANU Gambling Survey	2021	3,500	Participation, frequency, PGSI	Binary indicator of past-year gambling participation across multiple activity types (e.g., lotteries, sports betting, EGMs); gambling frequency by activity (e.g., weekly, monthly); PGSI (Problem Gambling Severity Index, 0–27) to assess gambling harm severity	Online panel; see Gambling in Australia 2024, pp. 4–6 and 7–10
ANU Gambling Survey	2023	3,510	Participation, frequency, PGSI, time spent	Binary indicator of past-year gambling across gambling activities; self-reported gambling frequency; estimated time spent gambling (per week or session); PGSI (0–27) included to assess risk and harm	Online panel; see Gambling in Australia 2024, pp. 4–6, 7–10 and 12–15
ANU Gambling Survey	2024	3,500	Participation, frequency, PGSI, time spent	Binary indicator of past-year gambling across gambling activities; self-reported gambling frequency; estimated time spent gambling (per week or session); PGSI (0–27) included to assess risk and harm	Online panel; see Gambling in Australia 2024, pp. 4–6, 7–10 and 12–15
AIHW National Gambling Trends Study	2022	10,019	Participation (13 types), platform, frequency, PGSI, harm	Binary indicator of gambling participation across 13 types (e.g., pokies, sports betting); gambling frequency and expenditure; mode of access (online vs venue-based); PGSI (0–27) to assess gambling harm; additional items measured harms to self (e.g., financial, health, emotional) and harms to others (e.g., family, relationships)	Phone/online mixed mode; see Gambling in Australia 2024, pp. 7–9, 11–17 and 22–27
Gambling Research Australia (GRA)	2021	15,000	Gambling participation, mode, PGSI, risk classification	Binary indicator of past-year gambling participation; gambling mode by product (e.g., online vs. venue); PGSI (Problem Gambling Severity Index, 0–27); risk classification (non-problem, low, moderate, problem gambler)	National online survey; see Final IGS Report 2021, pp. 16–18, 22 and 25
Gambling Research Australia (GRA)	2011	Not reported	Gambling participation, PGSI risk	Binary indicator of past-year gambling participation; PGSI categories (non-problem, low risk, moderate risk, problem gambler), derived from retrospective self-report	Baseline comparison; retrospective recall; see IGS Report 2021, pp. 18, 25
Gambling Research Australia (GRA)	2019	5,019	Gambling participation, PGSI risk, mode	Past-year gambling participation, PGSI classification (Problem Gambling Severity Index) and gambling mode (online vs. venue) by product	Cross-sectional survey; longitudinal cohort; see IGS Report 2021, pp. 18, 22, 25

Results

We aimed to examine the impact of Google search interest in ‘gambling’ and self-reported gambling prevalence on national gambling expenditure across selected years (2011, 2015, 2018, 2019, 2020, 2021, 2022, 2023 and 2024). A multiple linear regression model was estimated to assess the relationship between standardised gambling expenditure and a set of standardised predictors: Google search interest, self-reported participation rates and a categorical variable representing time period. Given the number of time points, year fixed effects were not included to avoid overfitting and preserve model stability. Instead, a three-level period variable (pre-COVID = 2011, 2015, 2018, 2019; COVID = 2020, 2021; post-COVID = 2022, 2023, 2024) was introduced to capture broader temporal trends. This variable was dummy coded with the pre-COVID period serving as the reference category (coded 0). All continuous variables were standardised to facilitate interpretation of coefficients and ensure comparability across predictors. To account for potential heteroskedasticity in the residuals, robust standard errors (HC1) were applied. Variance inflation factors (VIFs) and influence diagnostics were also computed to assess multicollinearity and model sensitivity (analysis code is provided in Appendix 2). The model is expressed as:

{Expenditure}_{t} = β_{0} + β_{1} {GoogleSearch}_{t} + β_{2} {SelfReportPrevalence}_{t} + β_{3} {YearPeriod}_{t} + ε_{t}
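
The two preparation steps behind this model, standardising the continuous variables and dummy coding the period variable, can be sketched as below. This is an illustrative sketch only: the function names are ours and the search interest values are hypothetical, not the study data.

```python
from statistics import mean, stdev


def standardise(values):
    """Convert values to z-scores (mean 0, sample standard deviation 1)."""
    centre, spread = mean(values), stdev(values)
    return [(value - centre) / spread for value in values]


def period_dummies(year):
    """Return (covid, post_covid) indicators; pre-COVID is the reference (0, 0)."""
    if year in (2020, 2021):
        return (1, 0)
    if year >= 2022:
        return (0, 1)
    return (0, 0)


years = [2011, 2015, 2018, 2019, 2020, 2021, 2022, 2023, 2024]
# Hypothetical annual search interest scores, one per year.
google_raw = [40.0, 42.0, 45.0, 47.0, 50.0, 52.0, 58.0, 60.0, 63.0]
google_z = standardise(google_raw)

# One design row per year: intercept, standardised search interest, two dummies.
design = [[1.0, z, *period_dummies(year)] for year, z in zip(years, google_z)]
print(design[0], design[-1])
```

A three-level period variable enters the design matrix as two indicator columns, so the single YearPeriod term in the equation corresponds to two estimated coefficients, matching the 'Year period 2' and 'Year period 3' rows in Table 3.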

The regression model explained a substantial proportion of the variance in standardised national gambling expenditure (R² = 0.868, adjusted R² = 0.780, p = .008; see Table 3). Under conventional estimation, Google search interest was a marginally significant positive predictor (β = .34, SE = 0.16, p = .081), while self-reported gambling prevalence was not significant (β = .04, SE = 0.16, p = .838). The model also indicated a significant increase in gambling expenditure during the 2022 to 2024 period (β = 2.10, SE = 0.38, p = .001), but not during the 2020 to 2021 period (β = .33, SE = 0.38, p = .413). To ensure the robustness of statistical inference, we re-estimated the model using robust standard errors. Under this specification, Google search interest emerged as a statistically significant predictor (β = .53, SE = 0.14, p = .012), while self-reported gambling prevalence remained non-significant (β = .07, SE = 0.11, p = .557). This finding underscores the relative reliability of Google search interest as a predictor of gambling expenditure, as its effect was positive under both specifications and statistically significant once robust standard errors were applied. Multicollinearity did not pose a concern (all VIFs < 1.3), and the Breusch–Pagan test indicated no evidence of heteroskedasticity (BP = 2.31, p = .679). Residual diagnostics supported assumptions of linearity and normality. Overall, the findings highlight the robustness of Google search interest as a predictor of national gambling expenditure and provide initial evidence of limited explanatory power of self-reported participation rates in this context.

Table 3.

Regression Analysis with Robust Standard Errors.

	Estimate	SE	t	p
Adjusted R²	0.780
(Intercept)	−0.85	0.31	−2.74	.034
Self-report	0.07	0.11	0.62	.557
Google search	0.53	0.15	3.57	.012
Year period 2	0.44	0.35	1.26	.254
Year period 3	1.99	0.54	3.71	.010
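
For readers unfamiliar with HC1 standard errors, the sketch below shows how they are computed for an ordinary least squares fit. It is a plain-Python illustration on a tiny hypothetical dataset, not the study's analysis code (which is provided in Appendix 2); all function names are ours.

```python
from math import sqrt


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_hc1(X, y):
    """Return OLS coefficients and HC1 heteroskedasticity-robust standard errors."""
    n, k = len(X), len(X[0])
    Xt = transpose(X)
    bread = inverse(matmul(Xt, X))                      # (X'X)^-1
    Xty = [sum(x * v for x, v in zip(col, y)) for col in Xt]
    beta = [sum(b * v for b, v in zip(row, Xty)) for row in bread]
    resid = [v - sum(x * b for x, b in zip(row, beta)) for row, v in zip(X, y)]
    # "Meat" of the sandwich: X' diag(e^2) X.
    meat = [[sum(e * e * row[i] * row[j] for row, e in zip(X, resid))
             for j in range(k)] for i in range(k)]
    cov = matmul(matmul(bread, meat), bread)
    correction = n / (n - k)                            # HC1 small-sample factor
    se = [sqrt(correction * cov[i][i]) for i in range(k)]
    return beta, se


# Hypothetical data: an intercept column plus one predictor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 2.0, 5.0]
beta, se = ols_hc1(X, y)
print(beta, se)
```

The robust adjustment changes only the covariance matrix of the estimates, so in this sketch the coefficients are identical to those from a conventional fit and only the standard errors differ.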

To extend our analysis, a second model was estimated using state-level data. Google Trends enables state-specific reporting of search interest, allowing us to examine whether variation in online search activity across states predicts corresponding differences in gambling expenditure. While state-level gambling expenditure figures were available for selected years (2011, 2015, 2018, 2019, 2020, 2021) from the Australian Gambling Statistics (see Tables NSW 5, VIC 5, QLD 5, SA 5, WA 5, TAS 5, NT 5, ACT 5), a key challenge was the lack of consistently reported self-reported gambling prevalence at the state level. To address this, we estimated state-level prevalence by proportionally allocating national self-reported gambling participation rates based on each state’s share of the national population (National, state and territory population [December], Australian Bureau of Statistics). This approach follows a proportional allocation method, which distributes values according to observed population proportions (Valiente Fernández et al., 2023). It assumes a uniform distribution of gambling prevalence across states after adjusting for population size. As such, the prevalence figures used in this model should be interpreted as imputed values (state population × prevalence percentage) derived through proportional allocation rather than direct observation. Despite this constraint, the model enables a more geographically disaggregated examination of whether state-level variation in search interest and participation estimates predicts corresponding gambling expenditure.
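
The proportional allocation can be sketched in a few lines. The populations and the national participation rate below are hypothetical round numbers chosen for illustration (they are not ABS or survey figures), and the function name is ours.

```python
def allocate_prevalence(national_rate, state_populations):
    """Impute participants per state as state population x national rate."""
    return {state: population * national_rate
            for state, population in state_populations.items()}


# Hypothetical populations and national participation rate.
populations = {"NSW": 8_000_000, "VIC": 6_500_000, "TAS": 550_000}
national_rate = 0.35

estimates = allocate_prevalence(national_rate, populations)
for state, value in estimates.items():
    print(state, round(value))
```

Because the same national rate is applied to every state, the imputed values differ across states only through population size, which is the uniform-prevalence assumption noted above.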

To account for regional variation while enabling generalisation across states, a linear mixed-effects model was estimated with a random intercept for state. This specification captures unobserved differences in baseline gambling expenditure across states while retaining the ability to estimate the overall effects of key predictors. The model examined whether standardised Google search interest in ‘gambling’ and standardised imputed self-reported gambling participation predicted standardised state-level gambling expenditure across six time points (2011, 2015, 2018, 2019, 2020 and 2021).

Again, year fixed effects were not included to avoid overfitting and preserve model stability. Instead, a two-level categorical period variable was included to account for broader structural shifts in gambling behaviour. This variable distinguished between pre-COVID (2011, 2015, 2018, 2019) and COVID (2020, 2021). It was dummy coded with the pre-COVID period serving as the reference category (coded 0). The model is expressed as:

{Expenditure}_{i j} = β_{0} + β_{1} {GoogleSearch}_{i j} + β_{2} {SelfReportPrevalence}_{i j} + β_{3} {YearPeriod}_{i j} + u_{j} + ε_{i j}

where u_j ∼ N(0, σ_u²) represents the random intercept for state j, and ε_ij ∼ N(0, σ²) denotes residual error. The model explained a substantial proportion of variance in standardised gambling expenditure (conditional R² = 0.940), with fixed effects alone accounting for 24% of this variation (marginal R² = 0.237). The intraclass correlation coefficient (ICC = .921) indicated that a large proportion of the variance was attributable to between-state differences. Google search interest remained a statistically significant positive predictor (β = .14, SE = 0.04, p = .003) and imputed self-reported gambling participation also emerged as a significant predictor (β = .39, SE = 0.15, p = .016). The time period variable was marginally significant (β = −.11, SE = 0.06, p = .069), suggesting a modest decline in gambling expenditure during the COVID period (2020 and 2021). See Table 4 for full results. These findings suggest that both Google search interest and imputed self-reported gambling participation are predictors of gambling expenditure in a state-level context, even after accounting for regional differences through a random intercept structure.

Table 4.

Random Effects Model – Overall Gambling Expenditure.

	Coef.	SE	t	p
(Intercept)	0.05	0.25	0.20	.856
Self-report (Imputed)	0.39	0.15	2.59	.016
Google search	0.14	0.04	3.18	.003
Year period 1	−0.11	0.06	−1.87	.069
Marginal R²	0.237
Conditional R²	0.940
Variance components
State (σ_u²)	0.468
Residual (σ_ε²)	0.040
Number of states	8
Number of observations	56
Log likelihood	−10.641
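
As a quick arithmetic check, the reported ICC can be recovered from the variance components in Table 4. The sketch below also relates the marginal and conditional R² values, under the assumption (ours, not stated in the text) that they follow the common variance-partitioning definitions for random-intercept models; the function names are hypothetical.

```python
def icc(var_state, var_residual):
    """Share of non-fixed-effect variance that lies between states."""
    return var_state / (var_state + var_residual)


def conditional_r2(marginal_r2, icc_value):
    # Assumes variance-partitioning R2 definitions: the conditional value adds
    # the between-state share of the variance left over after the fixed effects.
    return marginal_r2 + (1 - marginal_r2) * icc_value


state_var, residual_var = 0.468, 0.040   # variance components from Table 4
icc_value = icc(state_var, residual_var)
cond_r2 = conditional_r2(0.237, icc_value)
print(round(icc_value, 3), round(cond_r2, 3))
```

Under these assumptions the reported variance components, ICC and R² values in Table 4 are mutually consistent to three decimal places.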

To assess multicollinearity, we examined correlations among predictors and the size of standard errors. The strongest correlation was modest (r = −.44 between Google search interest and time period), and all standard errors remained within interpretable bounds. These diagnostics suggest that multicollinearity was not a concern. Model log-likelihood was −10.64 (df = 6), and residual diagnostics indicated no violations of model assumptions.

Examining by type and brand

To further assess the predictive value of Google search interest and self-reported gambling behaviour, we turned our attention to sports betting, one of the most popular and rapidly growing gambling categories in Australia (Jenkinson et al., 2020). Due to the limited number of national-level time points (n = 6), which constrained statistical modelling, we again adopted a state-level analytical approach to maximise the number of observations while preserving percentage prevalence as a standardised behavioural indicator across states. We examined two forms of search interest: overall interest in the gambling type of sports betting (indexed by the Google Trends term ‘Sports Betting’) and brand-specific interest in major online betting providers (Google Trends terms ‘BetEasy’, ‘Ladbrokes’ and ‘Sportsbet’). For both approaches, we modelled wagering expenditure as a direct proxy for sports betting expenditure at the state level. State-level expenditure figures for these product types were available from the Australian Gambling Statistics (see Tables NSW 5, VIC 5, QLD 5, SA 5, WA 5, TAS 5, NT 5, ACT 5).

While self-reported gambling participation by type was available for 2011, 2015, 2018, 2019 and 2022, corresponding expenditure data by type (wagering) was only available up to 2021. Therefore, the timeframe for this analysis was limited to 2011, 2015, 2018 and 2019.

A linear mixed-effects model was estimated to test whether Google search interest (‘Sports Betting’) and imputed self-reported participation predicted wagering expenditure across Australian states. The imputed self-reported participation variable reflects the proportion of Australians who reported engaging in sports betting in national surveys, proportionally allocated to each state based on population size. The model included a random intercept for state (u_j ∼ N(0, σ_u²)) to account for repeated measurements within states over time, and residual error (ε_ij ∼ N(0, σ²)). Google search interest was a significant positive predictor of wagering expenditure (β = .34, p < .001). Self-reported participation, by contrast, was not a significant predictor (β = .03, p = .801). Intraclass correlation coefficients (ICC > .88) indicated substantial between-state variance. See Table 5 for model details.

Table 5.

Random Effects Model – Wagering Expenditure.

	Coef.	SE	t	p
(Intercept)	−0.02	0.38	−0.05	.964
Self-report (imputed)	0.03	0.11	0.25	.801
Google search	0.34	0.09	3.87	.001
Marginal R²	0.079
Conditional R²	0.891
Variance components
State (σ_u²)	1.13
Residual (σ_ε²)	0.15
Number of states	8
Number of observations	32
Log likelihood	−60.60

To build on these findings, we next examined brand-specific Google search interest. Using data from the AIHW report, which is the only source to capture brand-level gambling behaviour, we focused on three brands with sufficient Google Trends coverage: Sportsbet, BetEasy and Ladbrokes. Unlike earlier models that used longitudinal data, the brand-specific analysis was limited to a single year (2019), as this was the only year for which brand-level data were reported. As such, mixed-effects modelling was not appropriate, and we instead assessed cross-sectional associations between brand-specific Google search interest and wagering expenditure across states.

To examine the combined predictive value of brand-specific Google search interest and self-reported brand use on wagering expenditure, we first estimated an overall regression model pooling data across all brands. Both predictors were positively associated with wagering expenditure: Google search interest emerged as a strong and statistically significant predictor (β = .44, p < .001), while imputed self-reported brand use showed a marginally significant effect (β = .16, p = .079). These results suggest that online search behaviour is a reliable indicator of actual wagering expenditure, even when aggregated across brands.

We then estimated separate models for Sportsbet, BetEasy and Ladbrokes. For Sportsbet, brand-specific Google search interest was a positive predictor of wagering expenditure (β = 1.47, p < .001) and self-reported use of the brand also contributed significantly (β = .34, p = .004). A similar pattern was observed for BetEasy, where both Google search interest (β = .92, p = .001) and self-reported brand use (β = .40, p = .049) were positively associated with wagering expenditure. The strongest effects were observed for Ladbrokes, where both predictors significantly explained variation in wagering expenditure: Google search interest (β = 1.12, p < .001) and self-reported brand use (β = 1.23, p < .001). The findings reveal that both Google search interest and self-reported brand use closely mirror patterns in actual wagering expenditure.

To ensure the validity of our regression models, we conducted standard diagnostic tests. Variance Inflation Factors (VIFs) were low across all models (all <1.3), indicating no multicollinearity concerns. Visual inspection of residual plots revealed no major deviations from linearity or normality. However, the Breusch–Pagan test indicated significant heteroskedasticity in several models (e.g. Sportsbet: BP = 23.04, p < .001; BetEasy: BP = 21.53, p < .001; Overall model: BP = 44.89, p < .001). Accordingly, we report heteroskedasticity-robust (HC1) standard errors for all coefficients to provide conservative and reliable inference. Importantly, the results remained statistically robust under these adjustments. Full model results are presented in Table 6.
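To illustrate what the HC1 adjustment does, the sketch below computes classical and HC1 standard errors for the slope of a single-predictor regression. It is a simplified illustration, not the code used for Table 6: the reported models include two predictors and would need the matrix form of the estimator, the function name is hypothetical, and the data are invented to show residual spread growing with the predictor.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, classical SE, HC1 SE) for y = a + b*x fitted by OLS."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    dx = [xi - x_mean for xi in x]
    sxx = sum(d * d for d in dx)
    if sxx == 0:
        raise ValueError("x must vary")
    slope = sum(d * (yi - y_mean) for d, yi in zip(dx, y)) / sxx
    intercept = y_mean - slope * x_mean
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Classical SE: assumes one common error variance for all observations.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)

    # HC0 lets each observation contribute its own squared residual;
    # HC1 rescales HC0 by n / (n - k), with k = 2 estimated coefficients.
    hc0_var = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    se_hc1 = math.sqrt(hc0_var * n / (n - 2))
    return slope, se_classical, se_hc1


# Invented data in which the scatter widens as x increases.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_demo = [1.1, 1.9, 3.4, 3.2, 6.5, 4.6]
b, se_ols, se_robust = slope_standard_errors(x_demo, y_demo)
print(f"slope={b:.3f}  classical SE={se_ols:.3f}  HC1 SE={se_robust:.3f}")
```

The point estimate is identical under both approaches; only the standard error, and therefore the t statistic and p value, changes. This is why coefficients can be reported unchanged while inference is based on the robust errors.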

Table 6.

Regression Analysis With Robust Standard Errors – Wagering Expenditure.

	Model 1: Overall				Model 2: Sportsbet				Model 3: BetEasy				Model 4: Ladbrokes
	Coef.	SE	t	p	Coef.	SE	t	p	Coef.	SE	t	p	Coef.	SE	t	p
Adjusted R²	0.203				0.436				0.314				0.951
(Intercept)	0.00	0.09	0.00	1.00	−0.99	0.17	−5.70	<.001	−0.35	0.10	−3.37	.002	1.57	0.07	22.13	<.001
Self-report (imputed)	0.16	0.09	1.76	.079	0.34	0.07	4.49	.001	0.40	0.18	2.27	.030	1.23	0.09	13.62	<.001
Google search	0.44	0.09	4.78	<.001	1.47	0.37	3.96	<.001	0.92	0.28	3.24	.003	1.12	0.02	74.32	<.001

General discussion

This study makes a methodological contribution to gambling research by comparing Google Trends search data with traditional self-report survey data to assess their relative accuracy in predicting real-world gambling expenditure. Using predictive modelling across multiple years, a specific gambling type and individual brands, our findings provide strong empirical support for the utility of Google search interest as a reliable, real-time indicator of gambling behaviour. Across both national and state-level analyses, Google search interest consistently emerged as a significant predictor, whereas self-reported gambling participation showed limited or inconsistent associations with gambling expenditure. Further, the predictive power of Google search interest extended to the specific gambling type of sports betting, and to individual sports betting brands including Sportsbet, BetEasy and Ladbrokes. In contrast, self-reported participation predicted actual gambling expenditure only at the brand level. See Table 7 for a summary of our research questions and findings.

Table 7.

Summary of Research Questions and Findings.

Research question	Method/data used	Key finding
Can Google Trends predict gambling expenditure?	Predictive modelling using Google search interest (national, state, type, brand)	Yes; consistently significant across all models and levels of analysis
How does self-report survey data compare?	National surveys (2011, 2015, 2018, 2019, 2022); imputed values for states and brands	Weaker predictor overall; not significant nationally, moderately useful at state and brand-specific levels
Do results vary by gambling type or brand?	Segmented mixed-effects and regression analyses by gambling type and brand	Yes; Google Trends outperforms self-report for gambling type, and both are significant at brand level

This study extends the work of Houghton et al. (2023), which examined correlations between Google Trends and gambling operator data but did not evaluate self-report survey data. By directly testing the predictive validity of both data types, our findings suggest that Google search interest provides a more stable and objective measure of gambling expenditure, likely due to its ability to capture real-time consumer interest without the biases that often affect self-reported data. These results hold important implications for researchers, policymakers and public health professionals seeking more accurate tools to monitor gambling trends and evaluate harm reduction strategies.

Contributions

This study addresses a long-standing methodological limitation in gambling research: the field’s heavy reliance on self-reported data for monitoring consumer gambling behaviour (e.g. Schell et al., 2021). While surveys remain useful for understanding motivations and individual experiences, they are subject to bias, reporting delays and availability constraints (Heirene et al., 2022; Kuentzel et al., 2008). By demonstrating that Google search data is a stronger and more stable predictor of actual gambling expenditure, this study highlights the value of search data as a complementary or alternative method for understanding consumer gambling behaviour.

Beyond gambling, our findings contribute to the broader behavioural research literature by showing that population-level search data can improve the accuracy and responsiveness of research on risky or sensitive behaviours. This is particularly useful in areas such as substance use, mental health or disordered eating, where individuals may be reluctant to disclose behaviours through conventional data collection methods. By validating the link between Google search interest and actual expenditure in the context of gambling, our study strengthens the case for using search data as a proxy for real-world behaviour more broadly.

Implications

This study offers important implications for regulation and public health interventions. From a regulatory perspective, our findings highlight the potential for search data to inform gambling policy decisions. Search data could be incorporated into governments’ gambling monitoring requirements, where it could serve as an early warning system for increases in gambling activity and expenditure following regulatory changes, allowing real-time policy evaluation and adjustment. For instance, the Netherlands’ 2021 legalisation of online gambling under the Dutch Gambling Authority has led to growing concerns about consumer protection and increased exposure to gambling risks, particularly among young people (CMS, 2024). Policymakers in countries such as the Netherlands, where online gambling has been recently legalised or expanded, could use Google search interest to monitor the effects of legislation on gambling behaviour. In contrast, countries such as Australia have tightened regulation to reduce harms, for example through amendments to the Interactive Gambling Act 2001 that introduced a National Self-Exclusion Register in 2019 (ACMA, 2025). Google search interest can also be used to monitor the effectiveness of such consumer protection measures and adjust regulations accordingly.

The implications of this research also extend to harm reduction and public health strategies. Since we provide evidence that gambling-related searches predict actual gambling behaviour, public health agencies can leverage search trends to design and then evaluate targeted harm prevention campaigns. If increases in search volume for gambling-related terms are observed, governments and organisations could deploy targeted digital interventions, such as educational content, responsible gambling messages or targeted advertisements for gambling support services. Additionally, search interest data could be used to optimise the timing and placement of harm reduction campaigns, ensuring that they reach at-risk individuals during periods of heightened gambling interest. The effectiveness of these initiatives could then be evaluated using search interest data. By integrating Google Trends into harm minimisation efforts, stakeholders can develop more proactive and data-driven approaches to addressing gambling-related harm.

Limitations and future research directions

Despite the valuable contributions of this study, several limitations remain. First, the number of time points available for analysis was limited. National-level gambling expenditure and self-reported survey data were only consistently available for a handful of years between 2011 and 2024. This constraint significantly restricted the complexity of the models we could specify, particularly with regard to controlling for yearly fixed effects. Including fixed effects for each year, for example, would have risked overfitting the model due to the small sample size, reducing the reliability and generalisability of the results.

Second, the use of secondary data, while practical and historically grounded, limits our ability to examine individual-level predictors or fine-grained temporal effects (e.g. during major sporting events or policy shifts). Access to individual-level behavioural data would allow for deeper exploration of causal mechanisms, though such data is often unavailable for retrospective analysis. Methodological refinements in the analysis of aggregate data, such as the use of robust standard errors to address heteroskedasticity, cross-validation to assess model stability and fixed effects models to control for unobserved heterogeneity, are crucial to mitigate these challenges and improve the robustness of findings derived from secondary data sources.

A further constraint of our research was the inability to include additional potentially confounding demographic factors such as income, age and gender composition. While these factors are known to influence gambling behaviour and expenditure (Allami et al., 2021), incorporating them would have required a much larger dataset with more time points to ensure statistical power and model robustness. Given the restricted sample size, adding more variables would have increased the risk of multicollinearity and model overfitting. Future research should incorporate a wider range of demographic variables, ideally deploying longer time series or richer survey data, to disentangle the independent effects of digital behavioural signals and traditional self-reported measures.

Lastly, while we highlight the limitations of survey methods, future research could test improved designs, such as diary studies or short-interval recall, to assess whether more granular or frequent surveying better aligns with behavioural outcomes. Moreover, future work could explore how search interest relates not just to expenditure, but to market share, brand loyalty and advertising effects, especially in the context of external factors such as economic downturns, regulatory changes, or major gambling events.

Footnotes

Author contributions

Conceptualisation: JI, SB, KC; Methodology: JI, SB; Formal analysis and investigation: SB; Writing – original draft preparation: JI, SB, KC; Writing – review and editing: JI, SB, KC; Funding acquisition: N/A; Resources: JI, SB, KC; Supervision: JI, SB.

Ethical approval and informed consent statement

None required. This study uses secondary data that is publicly available.

Data availability statement

The dataset is available for download as a supplementary file.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The authors received no financial support for the research, authorship and/or publication of this article.

ORCID iDs

Jasmina Ilicic

Stacey Brennan

Katherine Cullerton

Supplemental Material

Supplemental material for this article is available online.

References

Allami, Hodgins, D. C., Young, Brunelle, Currie, Dufour, Flores-Pajot, M.-C., Nadeau (2021). A meta-analysis of problem gambling risk factors in the general adult population. Addiction, 116(11), 2968–2977.

Australian Communications and Media Authority. (2022). Online gambling in Australia: Findings from the 2021 ACMA annual consumer survey. https://www.acma.gov.au/sites/default/files/2022-02/Online%20gambling%20in%20Australia.pdf

Australian Communications and Media Authority. (2025). About the Interactive Gambling Act. https://www.acma.gov.au/about-interactive-gambling-act

Australian Institute of Health and Welfare. (2023). Gambling in Australia. https://www.aihw.gov.au/reports/australias-welfare/gambling

Australian Institute of Health and Welfare. (2024). Australia’s health 2024: In brief. AIHW. https://www.aihw.gov.au/reports/australias-health/australias-health-2024-in-brief/summary

Badu, Hallett, Vujcich, Crawford, Bellringer, M. E. (2023). Setting the scene: A scoping review of gambling research in Ghana. Health Promotion International, 38(6), daad171.

Brown, K. L., Russell, A. M. T. (2020). What can be done to reduce the public stigma of gambling disorder? Lessons from other stigmatised conditions. Journal of Gambling Studies, 36(1), 23–38.

Choi, Varian, H. A. L. (2012). Predicting the present with Google Trends. Economic Record, 88, 2–9.

Chumnumpan, Shi (2019). Understanding new products’ market performance using Google Trends. Australasian Marketing Journal (AMJ), 27(2), 91–103.

CMS. (2024). Netherlands issues first evaluation of online gambling rules. https://cms-lawnow.com/en/ealerts/2024/11/netherlands-issues-first-evaluation-of-online-gambling-rules

R. Y., Damangir (2015). Leveraging trends in online searches for product features in market response modeling. Journal of Marketing, 79(1), 29–43.

Equity Economics. (2025). Gambling in Australia’s cost-of-living crisis: The black hole in household budgets. https://www.equityeconomics.com.au/report-archive/gambling-in-australias-cost-of-living-crisis-the-black-hole-in-household-budgets

Flensted (2024). How many people use Google? Statistics and facts. https://seo.ai/blog/how-many-people-use-google

France, S. L., Shi, Kazandjian (2021). Web trends: A valuable tool for business research. Journal of Business Research, 132, 666–679.

Gainsbury, Parke, Suhonen (2013). Consumer attitudes towards Internet gambling: Perceptions of responsible gambling policies, consumer protection, and regulation of online gambling sites. Computers in Human Behavior, 29(1), 235–245.

Gainsbury, S. M., Russell, Hing, Wood, Blaszczynski (2013). The impact of internet gambling on gambling problems: A comparison of moderate-risk and problem Internet and non-Internet gamblers. Psychology of Addictive Behaviors, 27(4), 1092–1101.

Gambling Research Australia. (2021). The second national study of interactive gambling in Australia (2019-2020). https://www.gamblingresearch.org.au/sites/default/files/2021-10/Final%20IGS%20report%202021.pdf

Ginsberg, Mohebbi, M. H., Patel, R. S., Brammer, Smolinski, M. S., Brilliant (2009). Detecting influenza epidemics using search engine query data. Nature, 457(7232), 1012–1014.

Goel, Hofman, J. M., Lahaie, Pennock, D. M., Watts, D. J. (2010). Predicting consumer behavior with Web search. Proceedings of the National Academy of Sciences, 107(41), 17486–17490.

Goldstein, A. L., Vilhena-Churchill, Munroe, Stewart, S. H., Flett, G. L., Hoaken, P. N. (2017). Understanding the effects of social desirability on gambling self-reports. International Journal of Mental Health and Addiction, 15, 1342–1359.

Griffiths, M. D., Whitty, M. W. (2010). Online behavioural tracking in Internet gambling research: Ethical and methodological issues. International Journal of Internet Research Ethics, 3(1), 104–117.

Heirene, R. M., Wang, Gainsbury, S. M. (2022). Accuracy of self-reported gambling frequency and outcomes: Comparisons with account data. Psychology of Addictive Behaviors, 36(4), 333–346.

Hing, Russell, A. M. T., Lamont, Vitartas (2017). Bet anywhere, anytime: An analysis of internet sports bettors’ responses to gambling promotions during sports broadcasts by problem gambling severity. Journal of Gambling Studies, 33, 1051–1065.

Houghton, Boy, Bradley, James, Wardle, Dymond (2023). Tracking online searches for gambling activities and operators in the United Kingdom during the COVID-19 pandemic: A Google Trends™ analysis. Journal of Behavioral Addictions, 12(4), 983–991.

Howse, Cullerton, Grunseit, Bohn-Goldbaum, Bauman, Freeman (2022). Measuring public opinion and acceptability of prevention policies: An integrative review and narrative synthesis of methods. Health Research Policy and Systems, 20(1), 26.

R. Y., Damangir (2014). Decomposing the impact of advertising: Augmenting sales with online search data. Journal of Marketing Research, 51(3), 300–319.

IBISWorld. (2024). Per capita gambling expenditure. https://www.ibisworld.com/australia/bed/per-capita-gambling-expenditure/4936/

IMARC. (2024). Australia online gambling market size, share, trends and forecast by game type, device, and region, 2025-2033. https://www.imarcgroup.com/australia-online-gambling-market

Jenkinson, Sakata, Khokhar, Tajin, Jatkar (2020). Gambling in Australia during COVID-19. Australian Gambling Research Centre. https://aifs.gov.au/research/research-snapshots/gambling-australia-during-covid-19

Johnson, H. A., Wagner, M. M., Hogan, W. R., Chapman, Olszewski, R. T., Dowling, Barnas (2004). Analysis of Web access logs for surveillance of influenza. In Fieschi, Coiera, E. W. (Eds.), MEDINFO 2004 (pp. 1202–1206). IOS Press.

Kuentzel, J. G., Henderson, M. J., Melville, C. L. (2008). The impact of social desirability biases on self-report among college student and problem gamblers. Journal of Gambling Studies, 24, 307–319.

Markham, Young, Doran (2014). Gambling expenditure predicts harm: Evidence from a venue-level study. Addiction, 109(9), 1509–1516.

Mölenberg, F. J. M., de Vries, Burdorf, van Lenthe, F. J. (2021). A framework for exploring non-response patterns over time in health surveys. BMC Medical Research Methodology, 21(1), 37.

QGSO. (2024). Australian Gambling Statistics. https://www.qgso.qld.gov.au/statistics/theme/society/gambling/australian-gambling-statistics

Rong, Wilkinson, I. F. (2011). What do managers’ survey responses mean and what affects them? The case of market orientation and firm performance. Australasian Marketing Journal (AMJ), 19(3), 137–147.

Schell, Godinho, Cunningham, J. A. (2021). Examining change in self-reported gambling measures over time as related to socially desirable responding bias. Journal of Gambling Studies, 37, 1043–1054.

Soutar, G. N., Murphy (2009). Journal quality: A Google Scholar Analysis. Australasian Marketing Journal (AMJ), 17(3), 150–153.

Statista. (2024). Market share of leading search engines worldwide from January 2015 to January 2024. https://www.statista.com/statistics/1381664/worldwide-all-devices-market-share-of-search-engines/#:~:text=In%20January%202024%2C%20the%20online,share%20of%20around%2091.47%20percent

Thomas, van Schalkwyk, M. C. I., Daube, Pitt, McGee, McKee (2023a). Protecting children and young people from contemporary marketing for gambling. Health Promotion International, 38(2), daac194.

Thomas, Cowlishaw, Francis, van Schalkwyk, M. C. I., Daube, Pitt, McCarthy, McGee, Petticrew, Rwafa-Ponela, Minja, Fell (2023b). Global public health action is needed to counter the commercial gambling industry. Health Promotion International, 38(5), daad110.

Thomas, S. L., Pitt, Randle, Cowlishaw, Rintoul, Kairouz, Daube (2022). Convenient consumption: A critical qualitative inquiry into the gambling practices of younger women in Australia. Health Promotion International, 37(6), daac153.

Valiente Fernández, García Fuentes, Delgado Moya, F. D. P., Marcos Morales, Fernández Hervás, Barea Mendoza, J. A., Mudarra Reche, Bermejo Aznárez, Muñoz Calahorro, López García, Monforte Escobar, Chico Fernández (2023). Could machine learning algorithms help us predict massive bleeding at prehospital level? Medicina Intensiva, 47(12), 681–690.

van Schalkwyk, M. C. I., Petticrew, Cassidy, Adams, McKee, Reynolds, Orford (2021). A public health approach to gambling regulation: Countering powerful influences. Lancet Public Health, 6(8), e614–e619.

Woodside, A. G. (2011). Responding to the severe limitations of cross-sectional surveys: Commenting on Rong and Wilkinson's perspectives. Australasian Marketing Journal (AMJ), 19(3), 153–156.

Brynjolfsson (2015). The future of prediction: How Google searches foreshadow housing prices and sales. In Goldfarb, Greenstein, S. M., Tucker, C. E. (Eds.), Economic analysis of the digital economy (pp. 89–118). University of Chicago Press.
