
Abstract

While there has been much research into developing artificial intelligence (AI) techniques for fake news detection aided by various global benchmark datasets, it has often been pointed out that fake news in different geo-political regions traces different contours. In this work we uncover, through analytical arguments and empirical evidence, the existence of an important characteristic in news originating from the Global South, viz., the geo-political veracity gradient. In particular, we conjecture that Global South news about topics from the Global North (such as news from an Indian news agency on US elections) tends to be less likely to be fake, and provide three forms of support for the conjecture. First, observing through the prism of political economy, we posit a relative lack of monetarily aligned incentives in producing fake news about a different region than the regional remit of the audience. Second, we provide empirical evidence for this from benchmark datasets used in AI research on fake news detection. Third, we illustrate this conjecture by observing its empirical effect when AI-based fake news detection models are tested in a regional remit distinct from that of their training. Consequently, we point out how AI models trained in the Global North may encounter this gradient as a facet that enhances friction within Global South application contexts, creating predictions with less utility for the Global South. We locate our work within emerging critical scholarship on geo-political biases within media in the context of widespread application of AI in fake news identification. We hope our insight into the geo-political veracity gradient will help illuminate the latent geo-political anchoring within AI for fake news detection.

Keywords

misinformation, fake news, AI, bias in AI, geo-political bias, systemic bias, news accuracy

Introduction

It is well understood that popular data sources do not have an even distribution of data from across geo-political regions. In particular, there is often a dominance of Global North (e.g., North America and Europe) contexts within data, often rightly interpreted as a digital exclusion of other contexts (Graham & Dittus, 2022). Such data skews and biases have been shown to have significant impacts on computational tasks such as facial recognition (Jaiswal et al., 2024) and skin cancer detection (Pope et al., 2024). The contours of data biases are complex and different across various sectors and require nuanced analyses to uncover. For example, it has been pointed out that Global South (e.g., Asia, Africa, Latin America) representation in Wikipedia is often due to Global North authors (Graham & Dittus, 2022); this points to the need to study potential second-order biases that are not visible superficially.

Data biases are of significant consequence when it comes to media. As noted in Rambaldi (2022), when viewing “… media information as a field of knowledge control,” the development norms from the Global North embed themselves into the public's perception of reality. Fuchs (2010) identifies concentration and transnationalization as tendencies within the media that orient it towards developing a form of information and media imperialism. The rise of social media, driven almost entirely from the Global North (perhaps Silicon Valley more specifically) as far as large parts of the world are concerned, has added to the pace of Global North dominance. The dominance of the Global North within media contexts in activism has been termed “Northern Visibilities” by Özkula and Reilly (2024). This dominance, when it extends to cultural aspects, could be termed cultural imperialism¹ and has significant real-world consequences. Chinmayi (2020) documents one instance where the friction between the design choices of Facebook and the social realities of Myanmar resulted in severe consequences during the Rohingya crisis. There are some notable challenges to such Global North hegemony, such as Al-Jazeera (Seib, 2005), but these remain few and far between.

Fake news, a phenomenon closely tied to media, has been consistently ranked among the top global risks² within contemporary society. For purposes of this paper, we use fake news as a broad category comprising myriad kinds of misleading information (e.g., misinformation, disinformation, rumors, hoaxes, scams) that claim the legitimacy of news. Fake news was particularly highlighted during the 2016 US election (Grinberg et al., 2019) and arguably has only intensified since. There has been tremendous interest in the artificial intelligence (AI) community in responding to this challenge, and fake news detection has emerged as a burgeoning area of AI research (Iqbal et al., 2023). Yet, due to issues such as data and media bias highlighted above, the AI capacity for fake news detection is aligned better with the needs of the Global North than those of the Global South. There is hardly any attention to the geo-political contours of fake news within scholarship of AI for fake news detection, resulting in Global North hegemony, and thus cultural imperialism, substantively percolating into AI algorithms built to mitigate fake news. The complex contours of such biases have attracted recent attention, with one study highlighting the geo-political bias relating to the usage of emotions within techniques to combat misinformation (Deepak et al., 2024). It is imperative to uncover the points of injection of such geo-political biases to fuel efforts that seek an equitable impact of AI methods in resisting disinformation across global societies. We locate our work against the backdrop of this larger goal.

The focus of this interdisciplinary paper is the characterization of a consistent trend in Global South news, especially within data sources used to train AI models. We call this trend the geo-political veracity gradient. We start by outlining this pattern formally as a conjecture, followed by three types of analyses of this trend. First, we outline factors from the political economy of media that could explain this trend. Second, we quantitatively outline the extent of this trend through empirical investigations of public datasets. Third, we empirically uncover the consequences of this trend for cross-regional usage of AI-based fake news detection. We conclude by drawing attention to the need to factor this trend into building AI-based methods for fake news detection.

Geo-political veracity gradient in Global South news

We outline the key insight of this paper as a conjecture upfront:

For a news article that satisfies two conditions:

It originates from a news source in the Global South, and

It addresses a topic that is substantively related to the Global North,

there is a higher than baseline chance that it is of high truthfulness/veracity.

It may be easier to illustrate this using examples. The conjecture suggests that if an India-based news source³ reports on events in the United States, it is more likely to be truthful reporting. Similarly, coverage from Chinese media on British elections is likely to be truthful. However, this conjecture does not say anything about scenarios such as a Canadian news article reporting on elections in Thailand.

In more abstract terms, if we split news from Global South into two distinct subsets viz., those on topics relating to Global North, and those reporting on Global South news, the conjecture suggests that it is likely that the latter would have a higher prevalence of misinformation than the former. The conjecture points to a significant structural geo-political veracity bias within news sources.
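
One compact way to write the subset comparison above is in terms of conditional probabilities. This is only a sketch: the symbols $a$, $S(a)$, $T(a)$ and $V(a)$ are notation assumed here for illustration and do not appear elsewhere in the article.

```latex
% Illustrative notation (assumed, not the article's own):
%   a     : a news article
%   S(a)  : region of the article's source (GS or GN)
%   T(a)  : region the article's topic relates to (GS or GN)
%   V(a)  : 1 if the article is truthful, 0 otherwise
\[
  P\bigl(V(a)=1 \mid S(a)=\mathrm{GS},\ T(a)=\mathrm{GN}\bigr)
  \;>\;
  P\bigl(V(a)=1 \mid S(a)=\mathrm{GS},\ T(a)=\mathrm{GS}\bigr)
\]
```

If the two topical subsets are assumed to partition Global South news, the overall Global South baseline $P(V(a)=1 \mid S(a)=\mathrm{GS})$ is a weighted average of the two sides, so the left-hand side also exceeds that baseline, which matches the "higher than baseline" wording of the conjecture.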

While we present this as a generalized conjecture, we do recognize that the categories of Global North and Global South encompass huge amounts of diversity, and the pattern stated above may not hold for certain combinations. As an example, the geographical proximity between South European nations and their former colonies which are now nations in the North African Maghreb region could create a mixing of media ecosystems which may affect the validity of this conjecture for such combinations. While these could lead to interesting research, this article does not account for such complexities.

Political economy of fake news

We now provide argumentative support for the conjecture based on insights from the political economy of media and fake news. We employ a critical political economy approach, considering the power relations, economic incentives and ideology within the production process of fake news. Viewing contemporary news production as underpinned by the ethos of capitalism, we illustrate how we may trace the connections from such production to the manifestation of the geo-political bias we have outlined above. This draws from the intellectual legacy of 19th century critiques of political economy (e.g., Marx (1857/1993)).

We begin with some background on contemporary media. Ad-driven media such as online and social media, which are increasingly becoming mainstream avenues for news consumption, are primarily governed by the notion of the attention economy. This involves the characterization of “human attention as a scarce but quantifiable commodity” (Crogan and Kinsley, 2012); this attention commodity is eventually monetized through advertising models, resulting in a proliferation of ads in online life. With the commodity form of attention being key, the forces of news creation are swayed to work in ways such that their news can amass as much user attention as possible. The intent of news production is no longer just to inform, but also to accumulate attention. This attention incentive progressively crowds out other factors such as news quality, leading to an increasing prevalence of highly sensationalized (Hendriks et al., 2018) and clickbait-style content within our news streams.

The category of fake news enhances monetization by its alignment with an additional pathway viz., direct and upfront monetization of content creation. For example, a content creator could be paid by a movie producer for promoting their film, or by a competitor to slander the same. News writers could be funded to flatter or defame products or political actors. The financial motive stems from the urge to use fake news to influence opinions among people in the real world. We call this the opinion incentive in fake news.

The argument so far is summarized in Figure 1. In short, news in ad-funded media is swayed by the need to accumulate user attention, whereas paid fake news is additionally swayed by the need to influence reader opinions.

Figure 1.

Incentives in general news vis-a-vis fake news: a simplified view.

We now turn our attention to a qualitative distinction between the two incentives, attention and opinion. First, we must look at the geo-localization of media as a background. A news source would naturally tend to attract audiences associated with its region or related regions. This is because news (or social media posts) is often about reporting on events in the physical world, and a local presence allows deeper local coverage, thus attracting more local audiences and enabling a positive feedback loop. Such localization of media audiences does not necessarily imply a localization of general news content. There may be sufficient local interest (i.e., attention potential) in a geographically remote event. For example, when there is interest locally, Chinese media may post on the U.S. elections, and Latin American movie buffs may report on Hollywood movies. However, when it comes to the opinion incentive, such remote coverage becomes less justified. For example, it would serve little purpose to fund fake news about a U.S. presidential candidate within Chinese media, and any defamation of a Hollywood movie within Latin America would cause only a blip of damage, if any at all. In short, the opinion incentive works in a geographically localized way, in a manner that the attention incentive does not. Thus, fake news, being influenced by the opinion incentive, tends to be much more anchored to the local remit of the source, an effect indicated by the pin icon in Figure 1. This makes fake news much more geo-localized to the regional remit of the source than news in general.

Considering Global South and Global North as regions, the above reasoning could yield two kinds of outcomes, viz., the conjecture in the previous section and its mirror image. The latter suggests that news about the Global South from Global North sources may be of higher veracity. However, the economic, political, and cultural hegemony of the Global North (Braff and Nelson, 2023), underpinned by the long colonial era, is reflected in contemporary media as an interest in Global North topics within the Global South, one that is not matched to the same extent in the other direction. Thus, the mirror image of the conjecture, while theoretically plausible, is not likely to be a significant or widespread effect.

Empirical evidence for geo-political veracity gradient

Towards providing empirical evidence for the geo-political veracity gradient conjecture, we assemble a dataset with four parts, each containing real (R) or fake (F) news from either the Global South (GS) or the Global North (GN). These are indicated as GN-R, GN-F, GS-R, and GS-F. While the Global North dataset is simply the highly popular ISOT (Information Security and Object Technology Research Lab) dataset⁴ used heavily to benchmark AI techniques for misinformation, the Global South dataset is formed by news datasets in Indian contexts found on Kaggle⁵, along with the FakeNewsIndia dataset (Dhawan et al., 2022). These have been randomly downsampled to ensure that each of the four partitions has similar counts.

Table 1 illustrates the frequency of a few notable entities of interest in Global North spheres across the four data partitions, with frequency measured as the fraction of articles containing the word/phrase. Apart from Global North geographical entities such as United States, Britain, Washington, Europe, and Japan, Table 1 includes surnames of notable political personalities, such as those relating to current and past U.S. presidents. The appearance of these entities within a news article is an indication of a Global North topical focus of the article; thus, Global South (GS) articles that have one or more of these terms likely satisfy the two conditions of our conjecture. If the conjecture and its consequences hold, we should see a relatively lower presence of these terms within GS-F (in comparison with GS-R). This is because there is no opinion incentive to produce fake news targeting the Global North.
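
The frequency measure described above (the fraction of articles containing a word or phrase) can be sketched as follows. This is a minimal illustration, not the procedure used to produce Table 1: the function name `fraction_containing`, the case-insensitive substring matching, and the toy texts are all assumptions made here.

```python
def fraction_containing(articles, phrase):
    """Return the fraction of articles whose text contains `phrase`.

    Matching is a case-insensitive substring test; this is an assumption,
    since the exact matching rule is not specified in the text.
    """
    if not articles:
        return 0.0
    needle = phrase.lower()
    hits = sum(1 for text in articles if needle in text.lower())
    return hits / len(articles)


# Hypothetical toy partition, invented for illustration only.
toy_partition = [
    "Trump addresses supporters in Washington",
    "Monsoon rains delay harvest in Kerala",
    "Parliament session adjourned in New Delhi",
    "United States and India sign trade agreement",
]

for term in ["United States", "Trump", "Washington", "Japan"]:
    print(f"{term}: {fraction_containing(toy_partition, term):.2%}")
```

Applying the same function to each of the four partitions would give one row of a table shaped like Table 1 per term.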

Table 1.

Frequency of Global North Words/Phrases.

Word or phrase	GN-R	GN-F	GS-R	GS-F
United States	28.76%	17.72%	5.08%	1.23%
Trump	44.83%	53.34%	4.55%	2.44%
Clinton	10.17%	21.92%	0.48%	0.15%
Britain	7.16%	1.04%	1.34%	0.27%
Washington	18.19%	13.99%	3.20%	0.75%
Europe	5.92%	2.79%	2.56%	0.48%
Japan	3.66%	0.52%	2.5%	0.81%

The results in Table 1 reveal an extremely low presence of Global North topics within GS-F, indicating that fake news from the Global South is unlikely to cover Global North topics. The key observation from the table is the sharp drop in the frequency of these words as we move from GS-R to GS-F; this supports our conjecture. The frequency of each word in GS-R is roughly two to five times its frequency in GS-F, indicating the intensity of the trend. This may be contrasted with the observation that there is no such consistent trend between GN-R and GN-F.
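
The real-to-fake ratios behind this reading can be recomputed directly from the Table 1 percentages. The snippet below is a small arithmetic check; the names `TABLE_1` and `real_to_fake_ratios` are introduced here for illustration.

```python
# Percentages copied from Table 1, ordered as (GN-R, GN-F, GS-R, GS-F).
TABLE_1 = {
    "United States": (28.76, 17.72, 5.08, 1.23),
    "Trump": (44.83, 53.34, 4.55, 2.44),
    "Clinton": (10.17, 21.92, 0.48, 0.15),
    "Britain": (7.16, 1.04, 1.34, 0.27),
    "Washington": (18.19, 13.99, 3.20, 0.75),
    "Europe": (5.92, 2.79, 2.56, 0.48),
    "Japan": (3.66, 0.52, 2.50, 0.81),
}


def real_to_fake_ratios(table):
    """Return {term: (GN-R / GN-F, GS-R / GS-F)} for each row of the table."""
    return {
        term: (gn_r / gn_f, gs_r / gs_f)
        for term, (gn_r, gn_f, gs_r, gs_f) in table.items()
    }


ratios = real_to_fake_ratios(TABLE_1)
for term, (gn_ratio, gs_ratio) in ratios.items():
    print(f"{term:<14} GN real/fake = {gn_ratio:5.2f}   GS real/fake = {gs_ratio:5.2f}")
```

On these figures every Global South ratio exceeds 1, from roughly 1.9 (Trump) to roughly 5.3 (Europe), whereas the Global North ratios fall on both sides of 1 (Trump and Clinton are below 1, the rest above).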

Our findings are contingent on the frequentist statistics we use. Future work could delineate the trend further by leveraging more sophisticated natural language processing techniques. These could attempt to incorporate linguistic and stylistics-based theories of news authoring through modelling machinery such as generative modelling. It could be reasonably expected that incorporation of linguistic theories could enhance fake news detection in general as well. Yet, these would require highly interdisciplinary research that blends journalistic understanding with statistical modelling, and remain outside the remit of this article.

Consequences for AI-based fake news detection

Our conjecture on the geo-political veracity gradient in Global South news could throw light on the frictions experienced while applying AI-based fake news detection models outside their regions of production. We outline and empirically illustrate two such effects herein, using the FNDNet (Kaliyar et al., 2020) model for fake news classification. These serve as coarse-grained support for our conjecture.

AI trained on Global North data applied in Global South: Consider an AI model for fake news classification trained using Global North data, being applied within Global South contexts. With the pervasive availability of pre-trained/foundation models in AI, this is a very likely scenario. The model, due to being trained on Global North data, may have internalized patterns of lexical correlations between Global North words and real/fake labels. If our conjecture holds, Global North words are much less prevalent in Global South fake news, resulting in the model being of very limited utility in that segment. Thus, such a model is likely to produce a lot of misclassifications of fake news as real (i.e., false negatives) as an illustration of its incompetence for decision-making on Global South fake news data.

Table 2 shows the results of our experiments to assess this trend, as a confusion matrix. Within a confusion matrix, an accurate classifier would have high entries in the top-left and bottom-right, since they correspond to cases where the actual and predicted labels are the same. Among the incorrect decisions, the bottom-left cell corresponds to false positives (real news classified as fake) and the top-right cell corresponds to false negatives (fake news classified as real). The bold value in Table 2 (the 482 false negatives) confirms our expectation of the high prevalence of false negatives; also noteworthy is the fact that the other kind of error, false positives, is about 40% less frequent.

Table 2.

GN-trained AI on GS-Test.

Actual label	Predicted: Fake	Predicted: Real
Fake	455	482
Real	285	605
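
The quantities discussed for Table 2 can be recomputed from its four counts. The sketch below treats "fake" as the positive class, following the definitions of false positives and false negatives given above; the names `TABLE_2` and `summarize` are introduced here for illustration.

```python
# Counts copied from Table 2, keyed as (actual label, predicted label).
TABLE_2 = {
    ("fake", "fake"): 455,
    ("fake", "real"): 482,  # false negatives: fake news classified as real
    ("real", "fake"): 285,  # false positives: real news classified as fake
    ("real", "real"): 605,
}


def summarize(cm):
    """Return accuracy, false negative rate, and how much rarer FPs are than FNs."""
    false_neg = cm[("fake", "real")]
    false_pos = cm[("real", "fake")]
    total = sum(cm.values())
    correct = cm[("fake", "fake")] + cm[("real", "real")]
    actual_fake = cm[("fake", "fake")] + false_neg
    return {
        "accuracy": correct / total,
        "false_negative_rate": false_neg / actual_fake,
        "fp_below_fn": 1 - false_pos / false_neg,
    }


summary = summarize(TABLE_2)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

On these counts, a little over half of the actual fake articles are classified as real, overall accuracy is about 58%, and false positives are about 41% fewer than false negatives.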

AI trained on Global South data applied in Global North: Let us now consider the less likely⁶ analogous case, that of taking fake news AI trained on Global South data and applying it within Global North contexts. If our conjecture holds, the AI would likely have identified and internalized the pattern within Global South news that Global North words correlate with real labels to a high degree. In particular, it would have internalized a high propensity to choose the label real when encountering Global North words. Now, when applied in Global North contexts, this should translate to a high inclination towards the real label in general, with all news in Global North contexts naturally exhibiting a high prevalence of Global North words.

Table 3 shows the confusion matrix for this case, indicating, as expected, a high propensity to choose the real label, as indicated by the bold values (847 and 884 in the Real column). It is interesting to see that the fake label is chosen so sparingly, indicating the intensity of the effects due to the conjecture.

Table 3.

GS-trained AI on GN-Test.

Actual label	Predicted: Fake	Predicted: Real
Fake	90	847
Real	6	884
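
How sparingly the fake label is chosen in Table 3 can be quantified from its counts. This is a small arithmetic sketch; the names `TABLE_3`, `predicted_share`, and `recall` are introduced here for illustration.

```python
# Counts copied from Table 3, keyed as (actual label, predicted label).
TABLE_3 = {
    ("fake", "fake"): 90,
    ("fake", "real"): 847,
    ("real", "fake"): 6,
    ("real", "real"): 884,
}


def predicted_share(cm, label):
    """Fraction of all test articles that the model assigns to `label`."""
    total = sum(cm.values())
    predicted = sum(n for (_, pred), n in cm.items() if pred == label)
    return predicted / total


def recall(cm, label):
    """Fraction of articles whose actual label is `label` that are predicted as such."""
    actual = sum(n for (act, _), n in cm.items() if act == label)
    return cm[(label, label)] / actual


print(f"share predicted real: {predicted_share(TABLE_3, 'real'):.1%}")
print(f"share predicted fake: {predicted_share(TABLE_3, 'fake'):.1%}")
print(f"recall on fake news:  {recall(TABLE_3, 'fake'):.1%}")
print(f"recall on real news:  {recall(TABLE_3, 'real'):.1%}")
```

On these counts, roughly 95% of all test articles receive the real label, and fewer than one in ten actual fake articles is flagged as fake.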

Discussions and conclusions

Discussion: Scholarship in data-driven algorithms and AI, in general, has been facilitated by a high prevalence of benchmark datasets, which have arguably played a substantial part in shaping the direction of research. The production of benchmark datasets has increasingly been seen as a service to the community while also offering strong citation-gathering incentives. However, this trend promotes a one-size-fits-all ethos which, when read against the backdrop of Global North hegemony, makes such benchmark datasets largely Global North-centric. This serves to codify and reinforce a neglect of Global South complexities and nuances in AI scholarship, working as a substantive pathway for the injection of cultural imperialism into AI technologies. Any efforts to bring the benefits of AI to the Global South, especially when it comes to highly sensitive and consequential realms such as fake news detection, ought to be accompanied by an urge to understand, contextualize, and assimilate the unique characteristics of Global South contexts within algorithms. We do expect that there would be many more fine-grained distinctive factors at the national or regional level within the Global South, ones that may require much more investigation to uncover.

Mitigating geo-political bias in AI for fake news detection: While the intent of this paper is limited to highlighting a facet of geo-political bias in AI for fake news detection, we here deliberate briefly on potential pathways to address the challenge. First, most obviously, the presence of Global South data within benchmark datasets for the task should be enhanced, to foreground the unique complexities of culturally diverse global regions. This would entail attention to such regional divergences within AI research on fake news detection. Second, proposals such as model cards (Mitchell et al., 2019), which seek to make public several details of a model, including its training details and intended use, could be expanded to attend to geo-political issues. For example, regulatory efforts could seek information on the geographical origins of the datasets that have been used in training the model, in the form of geo-political model cards. Such regulations would force creators of models developed for Global North or otherwise parochial settings to say so. Monitoring and regulation of cultural influence in algorithm design is much harder to address and may need specific research at the intersection of critical cultural studies and science/technology studies.

Conclusions

In this paper, for the first time, we analyzed the impact of regional orientation and its influence on fake news detection within the Global South. In particular, we outlined a consistent pattern within Global South news, where the prevalence of Global North words is directly related to news veracity, one we denote as the geo-political veracity gradient conjecture. We provided three forms of support for this conjecture. First, we provided argumentation grounded in the political economy of media and fake news as analytical support for the conjecture. Second, we analyzed popular datasets and illustrated the empirical measurability and intensity of the pattern. Third, we illustrated the consequences of this veracity gradient in limiting the cross-regional application of AI for fake news detection. By providing these arguments, we seek to bring attention to the nuances of fake news in different regions, particularly with respect to the Global South, to achieve an equitable impact in the larger objective of tackling disinformation with AI methods.

Future work: We find that a lot of marginalized communities within the Global South are not represented as much within digital media as compared to traditional media. In future work, we intend to consider how such representational gradients would influence AI for fake news detection, and to understand the nuanced effects of diverse Global South cultural contexts within fake news detection.

Footnotes

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This was part of a project funded by Queen's University Belfast's Global Research Partnership Development Fund (Ref: 07/2022).

Data availability

The manuscript has no associated data. A pre-print version of this submission appears at:

ORCID iD

Deepak P

Notes

References

Braff & Nelson (2022). The global north: Introducing the region. Gendered lives: Global Issues. Milne Publishing.

Chinmayi (2020). AI and the global south: Designing for other worlds. In Markus, Frank Pasquale, & Das (Eds.), The Oxford handbook of ethics of AI (pp. 588–606). Oxford University Press.

Crogan & Kinsley (2012). Paying attention: Towards a critique of the attention economy. Culture Machine, 13, 1–29. https://culturemachine.net/wp-content/uploads/2019/01/463-1025-1-PB.pdf

Deepak, Bhadra, Jurek-Loughrey, Kumar, G. S., & Kumar, M. S. (2024). Geo-political bias in fake news detection AI: The case of affect. AI & Ethics Journal, 1–6. https://doi.org/10.1007/s43681-024-00494-7

Dhawan, Bhalla, Arora, Kaushal, & Kumaraguru (2022). FakeNewsIndia: A benchmark dataset of fake news incidents in India, collection methodology and impact assessment in social media. Computer Communications, 185, 130–141. https://doi.org/10.1016/j.comcom.2022.01.003

Fuchs (2010). New imperialism: Information and media imperialism? Global Media and Communication, 6(1), 33–60. https://doi.org/10.1177/1742766510362018

Graham & Dittus (2022). Data and inequality: Geographies of digital exclusion. Pluto Press.

Grinberg, Joseph, Friedland, Swire-Thompson, & Lazer (2019). Fake news on Twitter during the 2016 US presidential election. Science, 363(6425), 374–378. https://doi.org/10.1126/science.aau2706

Hendriks Vettehen & Kleemans (2018). Proving the obvious? What sensationalism contributes to the time spent on news video. Electronic News, 12(2), 113–127. https://doi.org/10.1177/1931243117739947

Iqbal, Shahzad, Khan, S. A., & Chaudhry, M. S. (2023). The relationship of artificial intelligence (AI) with fake news detection (FND): A systematic literature review. Global Knowledge, Memory and Communication, 1–21. https://doi.org/10.1108/GKMC-07-2023-0264

Jaiswal, Ganai, Dash, Ghosh, & Mukherjee (2024). Breaking the global north stereotype: A global south-centric benchmark dataset for auditing and mitigating biases in facial recognition systems. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7, 634–646. https://doi.org/10.1609/aies.v7i1.31666

Kaliyar, R. K., Goswami, Narang, & Sinha (2020). FNDNet – A deep convolutional neural network for fake news detection. Cognitive Systems Research, 61, 32–44. https://doi.org/10.1016/j.cogsys.2019.12.005

Marx (1857/1993). Grundrisse: Foundations of the critique of political economy. Penguin UK.

Mitchell, Zaldivar, Barnes, Vasserman, Hutchinson, Spitzer, Raji, & Gebru (2019). Model cards for model reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (pp. 220–229). Association for Computing Machinery. https://doi.org/10.1145/3287560.3287596

Özkula, S. M., & Reilly, P. J. (2024). Where is the Global South? Northern visibilities in digital activism research. Social Media + Society, 10(4), 20563051241299835. https://doi.org/10.1177/20563051241299835

Pope, Hassanuzzaman, Sherpa, Emara, Joshi, & Adhikari (2024). Skin cancer machine learning model tone bias. arXiv preprint arXiv:2410.06385. https://doi.org/10.48550/arXiv.2410.06385

Rambaldi (2022). A review of the development divide between global north and south through a Foucauldian perspective: The divide: A brief guide to global inequality and its solutions, by J. Hickel, London, William Heinemann, 2017, £23.66 (hardcover), ISBN: 9781785151125. Development Studies Research, 9(1), 67–69. https://doi.org/10.1080/21665095.2022.2042348

Seib (2005). Hegemonic no more: Western media, the rise of Al-Jazeera, and the influence of diverse voices. International Studies Review, 7(4), 601–615. https://doi.org/10.1111/j.1468-2486.2005.00535.x

News about Global North Considered Truthful! The Geo-Political Veracity Gradient in Global South News