Abstract
We describe a novel computational dictionary for the study of right-wing populist conspiracy discourse (RPC) on the internet, specifically in the context of contemporary German politics. After first presenting our definition of conspiracy discourse and grounding it in antecedent research on mediated rhetoric at the intersection of right-wing populism and conspiracy theory, we proceed by outlining our approach to dictionary construction, relying on a combination of manual and automated methods. We validate our dictionary via parallel manual coding of 2,500 sentences using the categories contained in the dictionary as labels and compare the consensus result with the label assigned to each sentence by the dictionary, achieving satisfactory results. We then test our approach on two different datasets composed of alternative news articles and Facebook comments that spread conspiracy theories. Finally, we summarize our observations both on the methodological premises of the approach and on the object of populist right-wing conspiracy discourse and its dynamics more broadly. We close with an outlook on the potentials and limitations of the dictionary-based approach and future directions in applications of content analysis to the study of conspiracy discourse.
Introduction
Conspiracy theories are frequently deployed by fringe political actors that aim to undermine political institutions and boost their own claims to legitimacy. Examples on the right include individual politicians, parties and movements (Engesser et al., 2017; Gerbaudo, 2018; Puschmann et al., 2020). Conspiracy theories relating to controversial political issues in particular appear to activate supporters of both right-wing and left-wing causes by tapping into strong emotions, such as alienation, fear and resentment (Hameleers, 2021). Indeed, conspiracy theories serve as a common denominator to ideologically disparate populist strands (Bergmann and Butter, 2020), while the intersection with right-wing populism is especially relevant to the German context (Dostal, 2015; Lees, 2018; Vorländer et al., 2018).
It therefore seems desirable to develop computational tools for detecting such ideas in online environments and measuring their prevalence (e.g. Gründl, 2020, for right-wing populism). Such tools can exploit the entanglement of conspiracy theories and language, that is, identify a conspiratorial vocabulary and the specific linguistic markers that signal an affinity for conspiratorial thinking. While the term conspiracy theory is used extensively in both societal discourse and academic literature, it has often been pointed out that conspiracy theories, while frequently mimicking science, do not generally take a form that would hold up to scientific standards and in most cases hardly constitute theories. Alternative terms such as ‘myth’ or ‘narrative’, which stress the affinities to religious belief systems or questions of form and narrative representation, have accordingly been proposed. We use the term conspiracy discourse in order to emphasize the communicative nature of conspiracy theories: online in particular, conspiratorial ideas are told in order to be spread and developed, and specific linguistic properties, from vocabulary to certain stylistic markers, reliably signal conspiracy communication.
Our paper describes RPC-Lex, a computational dictionary to measure the composition of right-wing populist conspiracy discourse (RPC) in German-language texts. By applying our dictionary to Facebook comments and alternative news media articles, we provide a cross-platform perspective within which we put our conceptual frame of RPC discourse to the test. As digital environments in general and social media in particular do not primarily cater to single conspiracy theories but are prone to the merging of conspiracy and other discourses, the investigation of common tropes of conspiracy thinking and right-wing populism offers distinct advantages, taking into consideration content as well as linguistic elements. Especially because of a recent flurry of research on both conspiracy theories and populism, a study of how different dimensions of these discourses intertwine is relevant for further theoretical and conceptual advancements. RPC-Lex as a resource is intended to further the study of new media technologies in a cross-platform perspective, applying an original conceptual and theoretical approach in order to study RPC-discursive elements and particular combinations in which they occur.
We first detail our theoretical approach and its implications for dictionary design by individually examining elements of conspiracy discourse and elements of right-wing populism and discussing the connection of both. These theoretical considerations then form the basis of the 13 categories of our dictionary, ranging from
Right-wing populist conspiracy discourse and the German context
Elements of conspiracy discourse
Research on conspiracy theories must contend with the difficulty of finding a clear-cut definition of the term. This proves challenging, first, because conspiracy is not a neutral term, but often used pejoratively to delegitimize others in public or private discussions; second, because conspiracies do actually exist; and third, because, unlike the phenomenon itself, the term conspiracy theory is relatively new and was only coined in the 1970s (Butter and Knight, 2020: 3–4). As Andrew McKenzie-McHarg noted, there is ‘a vagueness that inheres to conspiracy theory as a concept’, which opposes a simple and universally valid definition (2020: 16). Within this vagueness, still, there are elements common to conspiracy belief that are widely accepted in the field. The three basic characteristics of most conspiracy theories, as formulated by Michael Barkun, are that (1) ‘Nothing happens by accident’, (2) ‘Nothing is as it seems’ and (3) ‘Everything is connected’ (2013: 3–4). These premises are embedded in a broader mechanistic understanding of history as intentionally and secretly determined by a (small) powerful group of conspirators, able to control the course of events over long periods of time (Butter, 2020: 21). This worldview goes hand in hand with the underlying Manichean dualism between good and evil found in conspiracy theories (Cubitt, 1989: 15). Within this framework, a typology of conspiracy theories differentiates, first, ‘top-down’ from ‘bottom-up’ conspiracies, depending on where in the (national/social/political) hierarchy the imagined conspirators are located (e.g. the political elites as conspirators vs., for example, dissidents, socialists or Jews as conspirators). Second, ‘internal’ conspiracies are distinguished from ‘external’ ones, depending on whether the imagined conspirators are situated inside or outside of a nation’s institutions (e.g. leading politicians as internal conspirators vs. foreign intelligence services or terrorist organizations as external conspirators). While historically speaking bottom-up conspiracy theories have been very popular, in recent years, ‘there has been a growing tendency in the Western world to identify internal and top-down conspiracies’ (Butter, 2020: 14–16).
Overview of RPC-Lex categories with minimal definitions and numbers of entries.
Elementary, the bolstering of an in- and out-group distinction prevalent in conspiracy discourse often takes the form of
Closely linked to this anti-elitist dimension of conspiracy discourse are several linguistic categories. First, a vocabulary of
Furthermore, a separate category
Finally, a dimension that is not only conceptually related to conspiracist thought but is arguably situated at the structural core of most conspiracy theories is
Many of the elements of conspiracy discourse discussed in this section are also subjects of populism research. How exactly the two concepts are related and what this means for an analysis of conspiracy discourse in the context of right-wing populist discourse in Germany is examined in the following section.
Linking conspiracy discourse and right-wing populism
The questions and issues discussed in research on conspiracy theories overlap substantially with those discussed in populism research, which justifies further interrogation of the connection between the two phenomena (Bergmann and Butter, 2020: 332–333). Populism is studied in a variety of disciplinary and interdisciplinary approaches. For all their differences in how exactly they define populism, three core elements can be identified throughout the research: (1) the reference to ‘the people’, (2) anti-elite sentiments and (3) a conception of ‘the people’ as homogeneous, monolithic, with strong exclusion strategies toward ‘Others’ or ‘the Other’ (Woods, 2014: 3). Expressed in terms of an ideational approach, populism can be defined as a ‘thin-centered ideology that considers society to be ultimately separated into two homogeneous and antagonistic camps, ‘the pure people’ versus ‘the corrupt elite’, and which argues that politics should be an expression of the volonté générale (general will) of the people’ (Mudde and Rovira Kaltwasser, 2017: 6 [italics in the original]). This ‘thin-centered’ ideational core consequently links populism to other – ‘thick-centered’ – ideologies, such as fascism, nationalism or socialism (Mudde and Rovira Kaltwasser, 2017: 6). Moffitt and Tormey conceive of populism as a ‘political style’ (2014) capturing this ideational core, which enables a language-based study of populist discursive practices corresponding to the study of conspiracy discourse outlined above, as ‘style and substance are thus interlinked in populist politics’ (Bergmann and Butter, 2020: 332).
However, even as research on conspiracy theories often mentions their possible function for populist or authoritarian politics and vice versa, there is a lack of research studying the connection explicitly. Bergmann (2018) establishes that, while conspiracy theories ‘can be tailored to any political view’ and populism can take different forms, ‘the two unite as an especially powerful force within the field of the nationalist far-right’ (Bergmann, 2018: 105; see also Byford, 2011; Priester, 2012; Dreesen, 2019). As Wodak explains with a focus on the instrumentalization of fears: ‘Conspiracies by enemies within and outside the nation are part and parcel of the discursive construction of fear by far-right populists’ (Wodak, 2021: 94 [italics in the original]).
Focusing on RPC discourse thus calls for an expansion of the category system detailed above. The following elements are neither exclusive to right-wing populist discourse nor to conspiracy discourse, but relevant dimensions for conspiracy discourse within right-wing populist contexts. Because of its ‘thin-centeredness’, populism is adaptable to various subjects. However, the relatively stable core elements of populism (anti-elitism, appeal to a monolithic ‘people’ by simultaneous exclusion of ‘Others’) can arguably be found in a lot of opportunistic right-wing slogans of recent times, such as ‘Klimahysterie’, ‘Ökoterror’, ‘Gendergaga’ or, as of late, ‘Corona-Diktatur’. While not depicting any of these topics specifically nor ‘populism’ in general, the categories of RPC-Lex are designed to illuminate a certain kind of populist discourse, namely at the intersection of right-wing populist and conspiracy thinking. Therefore, most categories have relevance even for the study of RPC discourses concerning changing topics not explicitly included in the conception of RPC-Lex.
While
An essential dimension of such RPC narratives is the topos of
In addition to these interconnected categories,
Finally, there is another conglomerate worth investigating as a category of RPC discourse, namely
These theoretical considerations show the strong structural connection between conspiracy and right-wing populist discourse. Combining the two terminologically into RPC discourse acknowledges these overlaps and allows for the study of interrelated elements of both discourses. While neither conspiracy nor right-wing populist language consists of absolutely fixed linguistic elements, relatively stable features of both discourses can be discerned (see Table 1 for an overview of all RPC-Lex categories introduced above, grouped according to the three dimensions ‘style’, ‘antagonists’ and ‘topoi’). Precisely because the ‘particular combination’ of these ‘psychological, structural and functional qualities […] can vary greatly according to context’ (Butter and Knight, 2020: 3), observing such combinations of conspiracy discourse elements in their intersection with elements of right-wing populist discourse constitutes the main concern of this paper and the focus of the RPC-Lex dictionary. Right-wing populism does not necessitate a belief in conspiracy theories or participation in their spread, and vice versa. However, through their structural similarities, synergy effects can foster broader support for movements that would ordinarily find themselves at the fringes of social and political discourse.
The evolution of right-wing populism in Germany
‘There is no threat to Western democracies today comparable to the rise of right-wing populism’, writes Mackert (2019: 1). While the rise of right-wing populism has been observed since the 1990s at least, more recent developments confirm its advance, usually linked to a profound sense of disenchantment with the democratic processes and institutions perceived as having failed and alienated ‘the people’ (Vorländer et al., 2018: 169). This is often illustrated by such issues as the Brexit vote, the election of Donald Trump or election successes of right-wing populist and extreme right or authoritarian parties across Europe, for example in France, Austria, Germany, Poland, Hungary and Italy. Germany as ‘Europe’s most influential member state’ (Lees, 2018: 299) holds an especially critical place on the European political plane. To exemplify the strengthening of right-wing populism in Germany, it is helpful to consider the two largest movements of recent years, namely the political party Alternative for Germany (AfD) and the association of ‘Patriotic Europeans against the Islamization of the Occident’ (Pegida: ‘Patriotische Europäer gegen die Islamisierung des Abendlandes’).
Pegida was founded in 2014 in the East German town of Dresden, state capital of Saxony. First only a private Facebook group, the group soon organized regular ‘demonstration walks’ through Dresden city on Mondays. The ‘refugee crisis’ of 2015 brought Pegida renewed publicity and the movement gained momentum in other parts of Germany (Vorländer et al., 2018: 2–5). Its conspiracist core is suggested in the name of the movement, namely the fear of the ‘Great Replacement’ through uncontrolled immigration and Islamization. As the name of the group clearly states, the main conditions for belonging to ‘the people’ are ethnicity and religion. This right-wing ideology is supported by the chanting of ‘lying press’ (‘Lügenpresse’) at Pegida demonstrations (Nachtwey, 2016: 135). Typical for the new forms of right-wing populism is the suggestion that there is no real freedom of speech and of opinion because ‘the establishment’ is supposedly forcing its agenda on citizens through the ‘system press,’ hence necessitating civil protest (Gadinger, 2019: 130). Pegida thus combines Islamophobic and anti-immigrant fears with a general anti-elitism including the ‘mainstream media’ (Dostal, 2015: 523).
While Pegida remains a non-parliamentary movement, which means it can employ a more radical anti-democratic rhetoric, the AfD, founded in 2013, took the parliamentary route in its efforts to challenge the political system. Some argue that it can be viewed as the parliamentary arm of the Pegida movement (Dostal, 2015: 523). Following its success at the 2017 federal election, the AfD became the first right-wing party to enter the German and European parliament since the 1950s, even becoming the third largest party grouping in the German Bundestag (Lees, 2018: 295–296; for the study of the AfD’s right-wing populism, see also Häusler, 2016; Wildt, 2017). As with Pegida, social media presence plays an important role in the AfD’s success (Serrano et al., 2019). The self-positioning of the AfD as a protest party constitutes a major pull factor for supporters who are disillusioned with a political system that they no longer perceive as representing, let alone benefiting, them. In their eyes this system lacks legitimacy, and they find themselves in a crisis of representation (Nachtwey and Heumann, 2019: 439). In turn, the vote for the AfD is itself perceived as an act of protest against a corrupted system (Nachtwey and Heumann, 2019: 451).
To summarize, there has been a considerable growth of right-wing populist protests, movements and parties in Germany, which substantially base their raison d’être on conspiracy theories such as the ‘Great Replacement’ while making extensive use of online communication and social media (Puschmann et al., 2020). Germany is one of the most influential political and economic players in Europe and is home to a range of right-wing (populist) movements that have gained momentum over the course of the past decades. A more detailed understanding of how right-wing populist ideas feed off and advance conspiracy thinking in the German context is not only interesting from the viewpoint of communication science, but is also politically relevant. In order to study this development in more detail, we first constructed and then validated our computational dictionary, a process that we describe in detail in the following section. While focusing on German content only, studying different corpora (Facebook data and alternative news media articles) provides a cross-platform perspective within which we apply our conceptual frame of RPC discourse.
Dictionary construction and validation
This paper employs two sets of methods for dictionary design drawn from the toolkit of the text-as-data approach increasingly popular in media and communication research as well as neighboring fields such as political science and sociology (Boumans and Trilling, 2016; Grimmer and Stewart, 2013; Maerz and Puschmann, 2020; Welbers et al., 2017). The first set of methods is used to create the dictionary and combines theoretical considerations and manual annotation with automated procedures for expanding, cleaning and optimizing the dictionary. The second set of tools is used to apply the dictionary and interpret the results of this application in order to demonstrate the dictionary’s usefulness for research purposes.
As implemented in the quanteda package (Benoit et al., 2018), the main application of a dictionary is to count the number of words in each message (or alternatively in the entire corpus) that are also contained in a specific dictionary category. Relative shares of each dictionary category can thus be calculated, either on the basis of overall words in a corpus or by assigning each text (or paragraph or sentence) in the corpus a label based on which category of the dictionary receives the most hits. These shares can then be compared along an additional variable – typically source, author or political leaning on the one hand, or time on the other – to identify differences and gauge shifts over time. Finally, the results can be subjected to statistical tests and used as model input, for example in regression analysis (Welbers et al., 2017).
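The counting logic described in this paragraph can be illustrated with a minimal sketch. The example below is written in plain Python rather than with quanteda, and everything in it is an assumption made for illustration: the category names, the assignment of terms to categories and the handling of ties are hypothetical and do not reproduce RPC-Lex or its release code.

```python
import re
from collections import Counter

# Hypothetical two-category dictionary. The category names and the assignment
# of terms to categories are placeholders, not actual RPC-Lex content.
toy_dictionary = {
    "category_a": {"lügenpresse", "systempresse"},
    "category_b": {"polizeistaat", "diktatur"},
}


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def count_hits(text, dictionary):
    """Count, per category, how many tokens of the text are dictionary entries."""
    tokens = tokenize(text)
    return Counter(
        {cat: sum(tok in terms for tok in tokens) for cat, terms in dictionary.items()}
    )


def label_text(text, dictionary):
    """Return the category with the most hits.

    Assumption: texts without any hit, and texts whose top two categories
    are tied, receive no label (None).
    """
    ranked = count_hits(text, dictionary).most_common()
    if not ranked or ranked[0][1] == 0:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def category_shares(texts, dictionary):
    """Share of each category among all dictionary hits in a corpus."""
    totals = Counter()
    for text in texts:
        totals.update(count_hits(text, dictionary))
    n_hits = sum(totals.values())
    return {cat: (totals[cat] / n_hits if n_hits else 0.0) for cat in dictionary}


sample_texts = [
    "Die Lügenpresse schweigt wieder.",
    "Wir leben im Polizeistaat, in einer Diktatur!",
]
print([label_text(t, toy_dictionary) for t in sample_texts])
print(category_shares(sample_texts, toy_dictionary))
```

The two functions correspond to the two ways of computing shares mentioned above: `category_shares` works on matched terms across the corpus, whereas `label_text` assigns each text a single category, whose frequencies can then be tabulated per source or time period.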
Computational dictionaries represent one – today quite traditional – set of tools to study textual data quantitatively (Young and Soroka, 2012). While their ease of use represents an advantage, dictionaries have been criticized for being less adaptable to different types of data and for being less reliable than newer approaches, particularly supervised machine learning, especially when they are imperfectly adjusted to the data at hand or when the aim is to assign texts discrete categories, as is the case in standardized content analysis (Chan et al., 2021; González-Bailón and Paltoglou, 2015). The significant limitations that apply when comparing dictionary performance to human coding combined with supervised machine learning necessitate stringent validation of any dictionary before application. Baseline validation of RPC-Lex was carried out by comparing human labeling of sentences with the codes assigned by the dictionary based on majority voting, with all the limitations that this approach entails (Barberá et al., 2021). While the accompanying release of the dictionary has provisions to make its use as a classifier feasible in principle, we caution users to carefully inspect the material they apply the dictionary to in order to safeguard against errors, and we enclose performance metrics to enable users to make informed choices when applying the dictionary to their own material (see Chan et al., 2021; Song, 2020, for suggestions on how to validate results in such scenarios).
Construction
There is no single accepted method to construct computational dictionaries. As dictionary analyses are ‘more deductive in nature and presuppose very detailed domain knowledge’ (Maerz and Puschmann, 2020: 44), the RPC-Lex dictionary categories as well as the terms chosen to populate them were developed based on the theoretical foundation illustrated above. In this way, the terms chosen for the dictionary can be understood as indicators for the theoretical concepts they are supposed to measure (Gründl, 2020: 6). At various stages of the process, inductive and explorative loops were performed based on the material described below (see Figure 1).
Methodological loop in dictionary construction and application.
In an undergraduate research seminar on conspiracy theories taught in the fall of 2019, students were asked to compile word lists on the basis of theoretical texts (Butter, 2018; Detering, 2019; Horn and Rabinbach, 2008; Hunger, 2016; Krause, 2011; Melley, 2000; Strässle, 2019), individual international and German-language case studies (The Protocols of the Elders of Zion, Compact-Magazin, KenFM, Alex Jones, Daniele Ganser, Eva Hermann) and research on (media) events occurring in the period of time our corpora cover (i.e. the ‘refugee crisis’ or the ‘Cologne New Year’s Eve, 2015/2016’). The result was an uncategorized collection of 1512 entries (1315 unigrams, 197 n-grams) and a list of abstract language features (e.g. quotation marks).
In order to arrive at a dictionary that is extensive enough to provide a good recall yet exact enough to be precise (Gründl, 2020: 6), the authors manually cleaned and expanded this first word list, and arranged it into categories in an extensive iterative process. First, a literature review on both conspiracy discourse and right-wing populism was conducted. The deductive category framework derived from these theoretical considerations passed through different stages, re-evaluating the suitability and applicability of categories for the corpora described below. Second, while the rationale behind individual categories is outlined in the previous sections, literature was also consulted to expand the initial word list and thus to populate the categories.
For the linguistic category of
The category of
The category of
Taking into consideration the difficulties of clearly defining
Following the development of an initial seed dictionary based on this review of the relevant literature, we next calculated the occurrences of the seed terms in a reference corpus consisting of comments in German-language right-wing Facebook groups, particularly those associated with Pegida and the AfD, spanning the period from 1 January 2015 to 24 May 2016 (see Puschmann et al., 2020, for details on this corpus). Co-occurrence and KWIC (keyword-in-context) searches as well as spot checks on qualitative samples for all categories were carried out using the reference corpus. We also relied on computational word similarity metrics to identify terms that occurred in conjunction with our seed terms in the reference corpus. In addition, further websites spreading conspiratorial and/or right-wing content were searched (e.g. the sites WikiMANNia and Metapedia, both modeled after Wikipedia). This latter step delivered a number of codes, relevant predominantly to the categories of
Finally, for all categories, synonyms of verbs, nouns and adjectives were generated using the Wortschatz Leipzig online resource, Dornseiff’s Der deutsche Wortschatz nach Sachgruppen (2004), Harras et al.’s Handbuch deutscher Kommunikationsverben (2004) and the online resource of the DWDS. Asterisks were used in a first automated step to gather all relevant variants within the corpora, and then expanded to identify a large number of additional derivative forms that were added to the dictionary. For a German-language dictionary especially, using regular expressions to capture the most frequent grammatical variations yields higher precision than using asterisks or stemming (Gründl, 2020: 8).
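The difference between asterisk wildcards and explicit inflection patterns can be made concrete with a short sketch. The vocabulary, the stem and the pattern below are invented for illustration and are not taken from RPC-Lex; the helper names `expand_wildcard` and `expand_regex` are likewise hypothetical.

```python
import re

# Hypothetical corpus vocabulary, invented for illustration.
vocabulary = ["Volk", "Volkes", "Volks", "Volkswagen", "Volkshochschule"]


def expand_wildcard(pattern, vocabulary):
    """Expand an asterisk pattern (e.g. 'volk*') into the concrete forms found in a vocabulary."""
    # Each '*' stands for any sequence of characters; all other characters are literal.
    regex = re.compile(".*".join(re.escape(part) for part in pattern.lower().split("*")))
    return sorted({word.lower() for word in vocabulary if regex.fullmatch(word.lower())})


def expand_regex(pattern, vocabulary):
    """Expand an explicit regular expression into the concrete forms found in a vocabulary."""
    regex = re.compile(pattern)
    return sorted({word.lower() for word in vocabulary if regex.fullmatch(word.lower())})


# The wildcard also captures unrelated compounds that merely share the stem.
wildcard_forms = expand_wildcard("volk*", vocabulary)

# An explicit pattern restricted to singular inflectional endings is narrower.
regex_forms = expand_regex(r"volk(e|es|s)?", vocabulary)

print(wildcard_forms)
print(regex_forms)
```

Writing the matched forms out as literal entries, as in the returned lists, is one way to arrive at a wildcard-free dictionary; the forms still need manual inspection, since a pattern cannot decide whether a match is substantively relevant.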
The enriched dictionary was once again cleaned manually, removing most function words, highly polysemous and high-frequency nouns, adjectives and verbs as well as terms occurring in more than three categories. Unclear terms, URLs, hashtags and other faulty entries were removed. Further inflected forms were added where necessary, to arrive at a wildcard-free dictionary. The result was a global dictionary with 14,105 entries (including those occurring in multiple categories). To aid the process of validation, the categorization and relevance of terms was once more critically evaluated by all authors. The dictionary was thus reduced once more following the same multi-person procedure to ensure accuracy and checked against new and relevant publications to ensure a high recall. The final dictionary consists of 10,829 unique entries, distributed over 13 categories.
A complicating factor that arises when applying computational dictionaries is that outcomes are influenced by baseline word frequency in a language, that is, certain words included in the dictionary may be far more likely to occur than others, and if the distribution of such words differs between categories this could adversely influence results (see e.g. Rauh, 2018). To provide potential users of the dictionary with a resource to counter this problem, an additional step toward enrichment was taken. We matched the terms contained within RPC-Lex with the DeReKo reference corpus for German (Kupietz et al., 2018), specifically with the most recent release of the DeReWo frequency-annotated word list, which provides valid base frequencies of the 100,000 most frequent terms in German. This is helpful because it allows users of RPC-Lex to weigh the occurrence of terms in their data against an expected base frequency to judge how indicative they are of a specific topic. For example, a single occurrence of the term Polizeistaat (police state; base frequency of 3760) can be regarded as more indicative of RPC than an occurrence of Freiheit (freedom; base frequency of 466,307). By applying the base frequency as a weighting factor or excluding highly frequent words entirely when applying the dictionary, users are able to further improve the validity of their approach.
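How base frequencies might enter the analysis can be sketched as follows. The two frequency values are those quoted above; the weighting formula (an inverse logarithm of the base frequency) and the optional exclusion threshold are assumptions chosen for illustration, not a scheme prescribed by RPC-Lex or DeReWo.

```python
import math

# Base frequencies quoted in the text for two example terms.
base_frequency = {"polizeistaat": 3760, "freiheit": 466307}


def weight(term, frequencies):
    """Assumed weighting scheme: 1 / log10(base frequency + 10).

    Terms missing from the frequency list are treated as having frequency 0
    and therefore receive the maximum weight of 1.0.
    """
    return 1.0 / math.log10(frequencies.get(term, 0) + 10)


def weighted_score(tokens, terms, frequencies, max_frequency=None):
    """Sum the weights of all tokens that are dictionary terms.

    If max_frequency is given, terms with a higher base frequency are
    excluded entirely instead of being down-weighted.
    """
    score = 0.0
    for token in tokens:
        if token not in terms:
            continue
        if max_frequency is not None and frequencies.get(token, 0) > max_frequency:
            continue
        score += weight(token, frequencies)
    return score


terms = {"polizeistaat", "freiheit"}
tokens = ["freiheit", "oder", "polizeistaat"]

print(weight("polizeistaat", base_frequency), weight("freiheit", base_frequency))
print(weighted_score(tokens, terms, base_frequency))
print(weighted_score(tokens, terms, base_frequency, max_frequency=100_000))
```

Under this assumed scheme the rarer term contributes more to the score than the frequent one, and the threshold variant drops the frequent term altogether. Which option is preferable, and where a threshold should lie, depends on the material and needs to be checked against it.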
Validation
Before formal validation of the dictionary was undertaken, we conducted a comparison with another dictionary recently developed for the study of right-wing populism by Gründl (2020) in order to characterize the degree to which the two resources describe similar concepts. This step does not represent a validation, but instead allows dictionary users to better evaluate the relative suitability of both resources to particular research questions they may be interested in. It should be pointed out that we did not base our dictionary on Gründl (2020), which was published when dictionary construction of RPC-Lex was well under way. Our aim was to determine whether substantial overlap exists between the terms incorporated into the two dictionaries, particularly for those categories assumed to be strongly influenced by right-wing populist concepts, rather than those categories in our dictionary that go beyond the categories captured in the Gründl dictionary. We achieved this by calculating the percentage share of terms in RPC-Lex also contained in Gründl (2020), differentiating by category. This results in an overlap of terms between the two resources that ranges from 38% for
Following this preliminary step, we compared the classification of texts via the dictionary to the judgement of human coders. Two student assistants were first provided with a basic code book describing the 13 categories in the dictionary along with a set of anchor example sentences (see our OSF repository for further details). An in-depth discussion of the examples among the coders and two of the authors was also conducted to clear up open questions on the composition of the categories. We deliberately refrained from in-depth coder training and a formal pretest in order to determine how reliably the categories could be coded with only minimal instructions. In the next step, the two coders independently labeled 2,500 sentences randomly sampled from a corpus of comments posted to 25 RPC Facebook pages using the dictionary categories as labels. Of these, 2,494 could be retained for further analysis, with six discarded due to technical error. The coders independently selected the same category in only about 50% of coded sentences, with considerable variation between categories (see Online Appendix 1). The 1,251 consensus cases where both coders chose the same category were very unevenly distributed between the 13 categories, with some categories occurring only a few times in the data. The appendix contains an overview of the consensus categories’ distribution.
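The agreement figure follows directly from the reported counts: 1,251 consensus cases out of 2,494 retained sentences correspond to roughly 50.2%. The sketch below shows how percent agreement and a consensus subset can be computed; the toy labels are invented, and the simple accuracy measure used to compare consensus labels with dictionary labels is an assumption for illustration, not necessarily the comparison reported in this paper.

```python
def percent_agreement(labels_a, labels_b):
    """Share of items to which two coders assigned the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both coders must label the same items.")
    if not labels_a:
        return 0.0
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def consensus_cases(labels_a, labels_b):
    """Return (index, label) pairs for items on which both coders agree."""
    return [(i, a) for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a == b]


def dictionary_accuracy(consensus, dictionary_labels):
    """Assumed metric: share of consensus cases where the dictionary label matches."""
    if not consensus:
        return 0.0
    hits = sum(dictionary_labels[i] == label for i, label in consensus)
    return hits / len(consensus)


# Agreement implied by the counts reported in the text.
reported_agreement = 1251 / 2494

# Invented toy labels for four sentences; "x", "y" and "z" stand for categories.
coder_a = ["x", "y", "x", "z"]
coder_b = ["x", "y", "z", "z"]
dictionary_labels = ["x", "x", "z", "z"]

consensus = consensus_cases(coder_a, coder_b)
print(round(reported_agreement, 3))
print(percent_agreement(coder_a, coder_b))
print(dictionary_accuracy(consensus, dictionary_labels))
```

Raw percent agreement does not correct for agreement expected by chance, which matters when categories are as unevenly distributed as they are here.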
Model coefficients per dictionary category.
It should be noted that this approach is hardly suitable to distinguish between RPC and non-RPC content as a result of the broad coverage of the dictionary. Using the dictionary to classify texts unequivocally requires human validation of a random sample of texts, including cases which do not match any category (see Chan et al., 2021). It is also worth pointing out that the dictionary was developed in a theory-led process, rather than having been explicitly designed for the material that we then applied it to in this step, with the validation tentatively suggesting that using it to choose among categories may yield satisfactory results under the right circumstances. However, the uneven category distribution is a considerable limitation, with the categories
Having taken these steps to safeguard the validity of the computational resource, we provide an application of the RPC-Lex dictionary to online discourse at the intersection of right-wing populism and conspiracy theory in the following section.
Application to two use cases
Overview of corpora used in analysis.
The Alternative News data were collected in 2020 and 2021 and cover the period from 2017 to 2019. In the case of the Facebook dataset, a somewhat longer period, from 2012 to 2019, is covered. Both the Facebook and Alternative News datasets were collected via web scraping. Both corpora encompass discourses that are thematically related to the RPC narratives that form the conceptual core of the dictionary – in other words, they contain communication that should be picked up by the dictionary as relevant to the issue under study.
Alternative news corpus
We applied the RPC-Lex dictionary to a full text corpus of news items from nine alternative news outlets discussed in the research literature on alternative news and conspiracy theories. Our understanding of alternative news outlets is based on the typology of Holt et al. (2019) who describe online news sources that exhibit a politically radical editorial policy and frequently circumvent journalistic norms. We base our selection on recently published studies of alternative right-wing news that provided reasoned lists of popular and influential news outlets (Boberg et al., 2020; Frischlich et al., 2020; Heft et al., 2019). Our selection of sources was furthermore influenced by two additional aspects. First, a substantial visibility in terms of Facebook engagement data, collected for another project, a parameter that added sites to the list that are not necessarily widely read independent of Facebook usage. Second, outlets being actively listed by German domestic intelligence (Bundesverfassungsschutz) as a potential danger to democracy (i.e. associated with violent threats), which applied only to a subset of sources. This resulted in a total of nine outlets of varying reach and visibility. We then scraped all URLs obtained from a particular source that had been shared on Facebook between January 2017 and December 2019 and applied the RPC-Lex dictionary to the data.
The result is a detailed profile of German right-wing alternative news sources covering specific aspects of RPC discourse, such as
RPC-Lex categories in the Alternative News Corpus by source.
It is possible to characterize the sources in terms of their distributional characteristics in relation to particular clusters of categories, for example, sources that score high in
Crucially, such differences tend to align with differences in the type of news source, for example, RT comparatively deemphasizes
Facebook corpus
The main aim of the application of RPC-Lex to the Facebook corpus is to show the ebb and flow of RPC discourse over time. The Facebook corpus was created as part of investigative reporting conducted by German public service broadcaster BR into the prevalence of right-wing hate speech in German-language Facebook groups. Reporters created fake Facebook profiles and thus gained access to a large number of non-public right-wing Facebook groups. Posts and comments from these groups were subsequently scraped, resulting in an archive that reaches back to 2010, though sample sizes were considered too small in the first several years for use in our analysis.
Figure 3 shows how the distribution of dictionary categories in the data changes over time in the period from 2015 to 2019. It is important to keep in mind that the basis in this case is a set of Facebook comments posted in a large number of extremist right-wing Facebook groups over this time span. This contrasts with the Alternative News dataset, in which the basis is news articles, which are generally longer and stylistically quite different from comments. As before, the variable of interest, in this case time, is shown on the X axis, while the Y axis shows the 13 different categories and their percentage shares, here computed as the number of terms matched with the respective category rather than as the share of messages. As before, these shares differ considerably.
RPC-Lex categories in the Facebook corpus over time.
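The difference between shares computed over matched terms and shares computed over messages can be made concrete with a minimal Python sketch. The two-category dictionary, the category names and the comments below are invented placeholders, not RPC-Lex entries or corpus data, and the matching is plain exact-token lookup rather than the quanteda procedure used for the actual analysis.

```python
from collections import Counter, defaultdict
import re

# Hypothetical two-category dictionary; NOT actual RPC-Lex entries.
TOY_DICTIONARY = {
    "category_a": {"elite", "regime"},
    "category_b": {"invasion", "border"},
}

# Invented (year, comment) pairs standing in for scraped comments.
TOY_COMMENTS = [
    (2015, "the elite and the regime ignore us"),
    (2015, "close the border"),
    (2016, "invasion at the border while the elite sleeps"),
]


def tokenize(text):
    """Lowercase the text and keep runs of letters (umlauts included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def term_shares_by_year(comments, dictionary):
    """Per year: share of all matched terms that fall into each category."""
    counts = defaultdict(Counter)
    for year, text in comments:
        for token in tokenize(text):
            for category, terms in dictionary.items():
                if token in terms:
                    counts[year][category] += 1
    shares = {}
    for year, counter in counts.items():
        total = sum(counter.values())
        shares[year] = {cat: counter[cat] / total for cat in dictionary}
    return shares


def message_shares_by_year(comments, dictionary):
    """Per year: share of comments with at least one term of a category."""
    hits = defaultdict(Counter)
    n_messages = Counter()
    for year, text in comments:
        n_messages[year] += 1
        tokens = set(tokenize(text))
        for category, terms in dictionary.items():
            if tokens & terms:
                hits[year][category] += 1
    return {
        year: {cat: hits[year][cat] / n for cat in dictionary}
        for year, n in n_messages.items()
    }


term_shares = term_shares_by_year(TOY_COMMENTS, TOY_DICTIONARY)
message_shares = message_shares_by_year(TOY_COMMENTS, TOY_DICTIONARY)
```

In this toy example the two measures diverge: term shares sum to one within each year, whereas message shares do not, because a single comment can match several categories.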
First, there is a set of categories that decreases over time. For example
Some of these fluctuations match up with expectations more clearly than others. For example, the clear long-term increases in categories such as
Perhaps, one of the most surprising developments is the lack of clearly visible growth in the
Discussion
In this paper, we have first described the concept of RPC discourse and then operationalized this concept by means of a computational dictionary. We have argued that (a) a meaningful nexus exists between conspiracy theory and right-wing populism and (b) that a computational dictionary represents a suitable resource for the comparative study of these mutually intertwined political phenomena. After having outlined the composition of our dictionary and taken steps toward validation, we have applied it to two large-scale corpora of online RPC content. This application has shown both the strengths and the weaknesses of our approach. In what follows, we will discuss first the limitations and then the potentials and future directions of the computational approach that we have presented.
Before we proceed in this direction, however, it is necessary to spell out why the concept of RPC discourse advances the field of political communication. As we have outlined, there exist significant affinities between right-wing populism and conspiracy theories that have been previously recognized but not (sufficiently) studied. A key reason for this mutual affinity lies in the antagonistic worldview articulated by right-wing populism. Powerful forces within and without a national political sphere are imagined to steer public opinion and make decisions to fundamentally alter society against the will of ‘the people’. These tendencies are clearly visible in the data we have analyzed for demonstration purposes. It also appears that certain types of media are closer to the style of alternative right-wing news than others (e.g. welt.de; cf. Puschmann et al., 2016). It is important to point out that the corpora do not fully cover the period of the COVID-19 pandemic, which is likely to be the explanation for the moderate growth of the
As is to be expected, there are also considerable limitations to our approach. These are partly related to the weaknesses of the bag-of-words approach in computational content analysis and partly a result of the conceptual difficulties of properly delineating different discrete categories describing discourse at the intersection of conspiracy theory and right-wing populism (Chan et al., 2021). While this problem arguably exists in computational content analysis generally and applies to any computational dictionary, it is exacerbated when the objects of study are as fluid and diverse as right-wing populism and conspiracy theory. Word embeddings in particular hold great promise in this context for improving computational dictionaries (Rodriguez and Spirling, 2022; Rudkowsky et al., 2018).
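One weakness of the bag-of-words approach can be illustrated in a few lines of Python. In this sketch, the one-term dictionary and both sentences are invented for illustration and are not taken from RPC-Lex or from the corpora. A sentence that endorses a claim and one that rejects it receive identical category counts, because word order and negation are ignored.

```python
from collections import Counter
import re

# Hypothetical one-category dictionary; NOT actual RPC-Lex content.
TOY_DICTIONARY = {"category_a": {"puppets"}}


def category_counts(text, dictionary):
    """Count dictionary matches per category, ignoring word order."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter()
    for token in tokens:
        for category, terms in dictionary.items():
            if token in terms:
                counts[category] += 1
    return counts


# Invented sentences: one endorses the claim, the other rejects it.
endorsing = "The politicians are puppets."
rejecting = "Nobody seriously believes the politicians are puppets."

counts_endorsing = category_counts(endorsing, TOY_DICTIONARY)
counts_rejecting = category_counts(rejecting, TOY_DICTIONARY)

# Both sentences are indistinguishable at the level of category counts.
indistinguishable = counts_endorsing == counts_rejecting
```

Cases like this are one reason why human validation on the material under study remains necessary.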
Crucially, RPC-Lex should be used as an RPC classifier only after careful validation, and only on material that is considered RPC, rather than on a mix of RPC and non-RPC content. Our objective was to create a resource that distinguishes different styles, themes and antagonistic relationships within RPC discourse, rather than one that reliably determines whether a piece of content should be classified as RPC or not. This is partly based on our own assumptions regarding the use of the dictionary (to identify categories within RPC discourse) and partly due to the way in which the quanteda package, on which RPC-Lex is based, applies dictionaries.
It should also be noted that RPC-Lex does not measure ‘populism’, nor can its categories be defined as ‘populist topics’. As pointed out in the literature review, populism’s ‘thin-centeredness’ makes it opportunistic and adaptable to various subjects, and lets right-wing populist actors invent new slogans or coded terms. Our focus is instead on relatively stable core elements of populism (anti-elitism, appeal to ‘the people’, exclusion of ‘others’) that make it possible to measure populist discourse independent of one specific topic. The usefulness of our dictionary in contexts other than the ones presented here must be critically evaluated. Dictionaries used to study policy areas such as security, environmental issues or healthcare can capitalize on a specialized lexis consisting of technical terms and jargon, the use of which reliably signals the presence of a certain category from the dictionary – arguably an advantage over RPC. While the addition of topics to a dictionary like RPC-Lex inevitably involves difficult design decisions (Chan et al., 2021), we do claim a certain theoretical validity as a safeguard against excessive contingency in our topic choices. This is achieved by using a theoretically supported conceptualization of categories as the basis for RPC-Lex. This is not to say that other topics are decidedly unfitting, or that our list of categories is exhaustive.
As we have described in the section on the composition of the dictionary, we have first qualitatively compiled a list of terms on the basis of the academic literature on right-wing populism and conspiracy theories and then drawn upon quantitative techniques to extend, revise and improve upon these seed terms. However, every corpus is different and, when applying the dictionary, it is of the utmost importance to carefully check the validity of the dictionary on the material under study via human coding, especially when the corpus is based on digital discourse such as social media or news texts. This is particularly important for categories such as
Considering these drawbacks, it is important to point out the specific advantages of the dictionary-based approach. As we have sought to demonstrate in our application of the dictionary, a key benefit is the ability to contrast the relationship of different categories to each other, in other words, to identify how strongly different tendencies within RPC discourse are expressed. This is not an end in itself. Contrasting category distributions becomes truly interesting when introducing a covariate such as the source of a news item (or the political leaning of that source), the poster of a social media message or a point in time, because distributional differences across the values of these covariates then become visible, revealing structural differences between them (Klein et al., 2019). When such an approach is applied, as we have sought to demonstrate, the strengths of the dictionary are maximized because the results of the analysis no longer depend on individual isolated cases. While humans excel at close reading and at the actual interpretation of a piece of text, the computer is able to identify large-scale distributional differences in discourse that a human would not recognize. Assuming that the categories are well operationalized, that they fit the data and that assigning a category to a piece of discourse is in fact possible on the basis of word usage (which it is not in all cases), this technique is accurate, reliable and highly scalable.
In addition to the quantitative use that we have outlined, it is also possible to combine a computational dictionary with qualitative methods. The most direct way of doing this is by using the dictionary categories only to identify relevant pieces of text which are subsequently read by a human. The ability to disentangle relevant pieces of discourse from a large social media corpus in this fashion can be of great advantage.
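A minimal Python sketch of this retrieval step might look as follows. The function name `select_for_close_reading`, the `min_hits` threshold, the mini-dictionary and the example messages are all hypothetical and serve only to illustrate the idea; they do not reproduce RPC-Lex or the quanteda workflow.

```python
import re

# Hypothetical mini-dictionary; NOT actual RPC-Lex entries.
TOY_DICTIONARY = {
    "category_a": {"elite", "regime"},
    "category_b": {"invasion", "border"},
}


def select_for_close_reading(texts, dictionary, category, min_hits=1):
    """Return (index, matched_terms, text) for texts with enough matches."""
    terms = dictionary[category]
    selected = []
    for index, text in enumerate(texts):
        tokens = re.findall(r"[^\W\d_]+", text.lower())
        matched = [token for token in tokens if token in terms]
        if len(matched) >= min_hits:
            selected.append((index, matched, text))
    return selected


# Invented example messages, not corpus data.
toy_texts = [
    "The regime and the elite decide everything.",
    "Nice weather today.",
    "They talk about the border again.",
]

# Print the candidates for one category so a human coder can read them.
for index, matched, text in select_for_close_reading(
    toy_texts, TOY_DICTIONARY, "category_a"
):
    print(index, matched, text)
```

Returning the matched terms alongside each text lets the human reader see why a message was selected and spot false positives quickly.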
In closing, we would like to make three suggestions regarding the future development of both this resource and similar ones to benefit the field of interdisciplinary research into conspiracy theories. First, it would be beneficial to apply the same conceptual structure as we have used to other languages and other political discourses, both for the results themselves and to improve the category system and make it more generalizable. We are convinced that the outlined structure translates – with certain limitations – to other political and linguistic environments. Second, a dictionary on conspiracy theories such as ours should be updated on a regular basis in order to capture recent developments such as conspiracy theories surrounding the COVID-19 pandemic. Third, we see further potential in our exploratory procedure of integrating qualitative approaches and close reading into the development of the dictionary, for example, by consulting sources such as books, magazines or web pages that are influential in communities invested in conspiracy theories.
Supplemental Material
Supplemental Material for RPC-Lex: A dictionary to measure German right-wing populist conspiracy discourse online by Cornelius Puschmann, Hevin Karakurt, Carolin Amlinger, Nicola Gess and Oliver Nachtwey in Convergence
Acknowledgements
The authors would like to thank Lea Liese, project associate, for her work in the early stages of dictionary composition, and the student research assistants Anna Fischer, Silvan Bolliger, Nicola Peters and Yingying Lee for their invaluable support in data analysis and dictionary validation.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (Halbwahrheiten (Nicola Gess)).
