Abstract
This paper examines the meaning of violence in contemporary Western societies. Scholars have argued that in contemporary Western societies, the concept is expanding toward a broader understanding of violence, beyond its “traditional” usage in the context of crime and war. The current paper aims to generate empirical evidence that speaks to this question. We take the Netherlands as a case study and apply machine learning techniques to discourse on violence in 80,000 articles published in national newspapers between 2012 and 2021. Results show that the public discourse on violence in the Netherlands has a component that can be described as the familiar or “traditional” usage of the term violence—referring to violent political conflicts, or interpersonal violence such as child abuse. Beyond this, the term violence is associated with discourse on societal challenges. It appears in discourse on social media, political polarization, and social injustice faced by ethnic minorities, women, and the LGBT+ community. The later stages of the analysis demonstrated that the terms associated with social injustice in particular (e.g., “racist”) become more closely associated with the concept of violence over time. In short, our findings support the notion that the collective understanding of violence in the Netherlands is developing toward a broader understanding of violence, beyond the context of crime and war. Specifically, in recent years the term violence is increasingly used in association with issues of social injustice.
Violence is a topic that generates great concern and strong emotions, among academics and policy makers, but also among the general public. When we talk about violence, many will have an intuitive sense of what this means: violence commonly refers to events where an offender inflicts physical harm on a victim with some degree of intent (World Health Organization, 1996), often in the context of crime, or political conflicts like war. However, this core is surrounded by a gray area of ambiguity around what incidents and acts count as “violence.” At least in Western societies, it seems that the meaning of the term violence has expanded, and has taken on a broader meaning beyond the traditional understanding. In contemporary Western societies, violence can describe a whole range of events, feelings, and forms of harm (Stanko, 2003). For instance, in the aftermath of the killing of George Floyd and in the context of the Black Lives Matter movement, variations of the slogan “White silence is violence” became popular among protesters (Farooqi, 2020; see e.g., Honeycutt, 2020), to indicate that even seemingly innocuous behaviors (“silence”) can contribute to significant harm and oppression, and should be characterized as a form of violence. Here we see the term violence used beyond its “traditional” meaning. In this study, we aim to generate empirical evidence that can speak to this question of how “violence” is understood in contemporary Western societies.
The question of how the phenomenon of violence is understood is relevant for several key reasons. First, as noted above, violence generates great concern and interest. A better understanding of how the public engages with the phenomenon of violence can help to nuance public debates characterized by strong opinions and emotions. Further, the term violence has strong negative and problematic connotations: it indicates some act that is beyond the bounds of acceptability. As such, societal attitudes to violence reflect beliefs about the nature of social relationships: how we treat each other, what behaviors are acceptable, and how we respond when someone transgresses (e.g., Bohannan, 1960; Given, 1977). The public understanding of violence therefore has academic relevance across the social sciences. Here, we aim to gather evidence for the idea that, at least in contemporary Western societies, the collective understanding of what violence entails has expanded beyond its traditional meaning.
There are three primary arguments to support the notion that the concept of violence has expanded. First, many concepts in the socio-cultural sphere (including violence) are socially constructed (De Haan, 2008; Lauwaert, 2019), which is to say that the meaning of “violence” is not fixed, but negotiated in interactions between different social actors. That is, our understanding of concepts such as violence changes continuously because of evolving cultural norms and priorities. While many concepts are constructed in this way, there is reason to believe that the concept of violence is under particular pressure. Violence is an essentially contested concept (e.g., de Haan, 2008), that is, a concept that is particularly likely to produce disagreements surrounding its meaning. Part of the reason why the meaning of the term violence is so contested is that the term is strongly negative: it indicates that the act in question is unacceptable, transgressive, harmful, and generally “bad” (Triplett et al., 2016). Political actors, activists, or other groups with social agendas can make use of these connotations, and strategically label a certain act as violence, to reflect their view that it should be considered transgressive and unacceptable. That is, terms like violence do not simply reflect established boundaries of what is acceptable or not but are used in the process of negotiating the boundaries of (un)acceptable behavior (Boches & Cooney, 2022).
Second, support for the notion that public understanding of violence has expanded comes from the fact that we see this trend in academia too (Hartmann, 2017; Roodt, 2019). Beyond the “traditional” conceptualization of violence in the context of crime and political conflicts, we now have well-developed theories of structural violence (Galtung, 1969), symbolic violence (Bourdieu & Wacquant, 1992), environmental violence (Lynch, 1990), and more recently slow violence (Nixon, 2011). Similarly, Presser (2013) argues that eating meat should be considered a form of violence against nonhuman animals. Here we also gain some insight into what these changes look like concretely—the perspectives outlined above share the tendency to apply the term violence in a wider sense than before, such as including experiences of nonhuman animals, or considering nonphysical forms of violence.
Finally, support for the idea that these developments occur in the West, in particular, comes from the fact that in many Western societies the “violence landscape” has changed in an objective sense too. Particularly in Europe, rates of interpersonal violence are relatively low, compared to earlier time periods, and compared to other world regions. Rates of homicide, the most extreme type of physical violence, have declined from hunter-gatherer societies, to early modernity, to the contemporary era (Eisner, 2003; Sumner et al., 2015; van Dijk & Tseloni, 2012). Evidence for such a decline can be seen both in terms of lethal violence (Aebi & Linde, 2014; Suonpää et al., 2022), and non-lethal violence (Moolenaar et al., 2023). In North America, interpersonal violence is not low per se, but there is evidence that the prevalence of lethal violence is declining in Canada (Farrell et al., 2018), and the U.S. (Sumner et al., 2015). Overall, then, in many Western societies individuals now have less exposure to violence than before. In terms of how this impacts understandings of violence, it has been shown that as the exposure to violent events goes down, concern about it goes up, an effect known as the “prevalence hypothesis” (Cooney & Burt, 2008). That is, the reduced exposure to violence among the general public in the West (and Western Europe in particular) is accompanied by an increasing sensitivity to issues of violence (Kivivuori, 2014). This increasing sensitivity is evident not only in shock and concern in response to violent events (Furedi, 2006), but also in the understanding of the concept of violence itself. As exposure to traditional forms of violence decreases, this creates space for new perspectives on violence to arise, where the concept of violence takes on a meaning beyond its traditional definition (Kivivuori, 2014, see also Beck, 1992; Boutellier, 2004). Indeed, Durkheim already described such an effect, which he labeled the “homeostatic constant” (Durkheim, 1982/1895; see also Kivivuori, 2014).
In short, there are several lines of reasoning that converge on the notion that in contemporary Western societies the public understanding of what constitutes “violence” has taken on a broader meaning beyond traditional definitions of violence as physical injury in the context of crime or war. Importantly, these analyses have remained conceptual and are not yet supported by empirical data. The current work aims to address this issue and seeks to generate empirical insight into how violence is understood.
The Current Study
This study examines how violence is understood in contemporary Western societies. This study is exploratory and as such we did not raise any concrete hypotheses. Broadly speaking, however, we are interested in examining the idea that the public understanding of what constitutes violence includes issues beyond the traditional context of crime and political conflicts and that this tendency has become more pronounced over the years. We focus specifically on the Netherlands. We believe the Netherlands provides a reasonable case study for developments in Western contexts because the reasoning outlined above suggests that developments in the understanding of violence are driven at least in part by the fact that exposure to “traditional” forms of violence has been declining. As such, it seems sensible to study these processes in a context where the prevalence of violent victimization is low—such as the Netherlands (Aarten & Liem, 2021).
Concretely, we study how the term “violence” is used in discourse in the public domain, specifically in newspaper articles published between the years 2012 and 2021. That is, we take newspaper articles as an approximation of public attitudes. Although newspapers are not the only form of public discourse (in recent years especially the public discourse also takes place in the online sphere), we believe newspapers can reasonably capture general trends, especially those that are more settled than the fast-paced world of online media. We include newspapers with national coverage across the political spectrum (details below), to ensure we cover the full range of public opinion. To this data, we apply two distinct natural language processing techniques. Such techniques have two main benefits. First, they can efficiently process large volumes of data. Second, they are data-driven: patterns are identified based on co-occurrences of words that are not evident to human observers. In the first stage of the analysis, we apply a topic model (Blei et al., 2003; Terman, 2017). Topic models analyze clusters of co-occurring words in a discourse, and thereby allow us to summarize a discourse in terms of key themes, or “topics.” In the second stage of the analysis, we capture the meaning of the term violence by considering its overlap with other terms, by employing word embeddings (Garg et al., 2018; Kozlowski et al., 2019). Word embeddings map terms in a high-dimensional continuous vector space, such that semantically related terms are located near each other. This distributional assumption was famously paraphrased by Firth (1957, p. 11), who wrote “You shall know a word by the company it keeps!” The two techniques we apply are complementary: the topic model aims to understand the discourse as a whole, while word embeddings focus on understanding the meaning of individual terms. As such, their combined findings will give us a comprehensive overview of the meaning of the term “violence” in public discourse.
Methods
Dataset
Newspaper articles were retrieved from NexisUni, a widely used newspaper source in the study of violence (e.g., Lee & Douglas, 2021; Zeoli et al., 2019). We identified newspaper articles published in eight national newspapers in the Netherlands, between the years 2012 and 2021. These outlets are Volkskrant, NRC Handelsblad, het Parool, Trouw, Algemeen Dagblad, de Telegraaf, Nederlands Dagblad, and Reformatorisch Dagblad. These newspapers cover the full range of political ideology from left-oriented to right-oriented and also include three newspapers with a Christian religious orientation.
The search criteria centered on the term “violence” (in Dutch): geweld OR geweld*. Translated, these are: violence OR violen*. The latter term was included to identify compound nouns (such as gewelddadig or geweldsmisdrijf). The range of publication dates was set to January 01, 2012 until December 31, 2021. This timeframe was chosen to strike a balance between a range that is long enough to detect changes in the use of the term, but short enough to generate stable topics and a manageable number of documents. Using the native filter options in the NexisUni database, we selected only newspaper articles and filtered out other content the newspapers might contain, such as advertisements. These search parameters identified 83,657 total documents. For each of the documents, we downloaded the full article text, as well as the document meta-data.
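To make the wildcard query concrete, the following is a minimal Python sketch of how a term such as geweld* can be matched in raw text. It is an illustration only, not the NexisUni search itself: the pattern and the function names `mentions_violence` and `matched_terms` are assumptions introduced here.

```python
import re

# Hypothetical stand-in for the database query "geweld OR geweld*":
# match any word that starts with "geweld", ignoring case.
GEWELD_PATTERN = re.compile(r"\bgeweld\w*", re.IGNORECASE)


def mentions_violence(text: str) -> bool:
    """Return True if the text contains 'geweld' or a word starting with it."""
    return GEWELD_PATTERN.search(text) is not None


def matched_terms(text: str) -> list[str]:
    """Return every matching word, lowercased, in order of appearance."""
    return [m.group(0).lower() for m in GEWELD_PATTERN.finditer(text)]
```

For example, `matched_terms("Geweld en een geweldsmisdrijf")` returns both the base term and the compound noun.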
Analysis Plan
We examine the meaning of “violence” in public discourse in the Netherlands. Our analysis consists of two stages. In the first stage of the analysis, we apply a topic model. Topic models analyze clusters of co-occurring words in a discourse, and thereby allow us to summarize a discourse in terms of key themes, or “topics.” In this model, then, we capture the meaning of the term violence by studying the discourse within which it occurs. Specifically, we might expect to find topics related to the traditional understanding of violence, focusing on interpersonal violence, and political violence like wars. Beyond this, we are interested in the question of how the term violence is used outside this traditional context, and whether this section of the discourse grows over the 10 years under study here. In the second stage of the analysis, we use word embeddings to evaluate the meanings associated with the term “violence”. Word embeddings capture the meaning of a given word by evaluating its relationship with other words. That is, the word embeddings approach studies (relationships between) individual terms rather than discourse elements, as the topic model does. Word embeddings map terms in a high-dimensional continuous vector space, such that semantically related terms are located near each other. Here, we might expect words reflecting the “traditional” meanings of violence to be mapped close to the core of the concept, while the “non-traditional” meanings associated with violence occur in the periphery of the concept. In terms of changes over time, we examine whether these nontraditional meanings are “drawn in” to the realm of violence over the 10 years under study. These models are complementary—the topic model aims to understand the discourse as a whole, while word embeddings focus on understanding the meaning of individual terms. Below, we elaborate on the specifications of the models.
Topic Model
Topic modeling is an unsupervised machine learning technique that detects patterns in text. The topics consist of clusters of frequently co-occurring words. Here, we apply a structural topic model (STM), which is an extension of latent Dirichlet allocation (LDA) topic models that allows for correlations between topics, and the use of meta-data as covariates. We implemented the STM using the stm package in R (Roberts et al., 2019). The topic model was prepared as follows. We lowercase and stem all words in the corpus. We remove stop words such as “and” and “can”, as well as special characters (e.g., “%”) and punctuation (e.g., “,”). Additionally, we exclude infrequent words, that is, those appearing in fewer than 10 of the documents (for similar pre-processing steps, see, for instance, Terman, 2017).
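The pre-processing steps described above can be sketched as follows. This is a simplified Python illustration, not the R pipeline used in the analysis: stemming is omitted, the stop-word list is a tiny placeholder, and the names `tokenize` and `preprocess` are assumptions introduced here.

```python
import re
from collections import Counter

# Placeholder stop-word list (assumption); a real analysis would use a full Dutch list.
STOP_WORDS = {"en", "kan", "de", "het", "een"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep only alphabetic tokens (drops '%', ',' and digits)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def preprocess(docs: list[str], min_docs: int = 10) -> list[list[str]]:
    """Remove stop words and words that occur in fewer than min_docs documents."""
    tokenized = [[t for t in tokenize(d) if t not in STOP_WORDS] for d in docs]
    # Document frequency: the number of documents in which each word appears at least once.
    doc_freq = Counter(word for doc in tokenized for word in set(doc))
    return [[t for t in doc if doc_freq[t] >= min_docs] for doc in tokenized]
```

The threshold is applied to document frequency rather than raw token counts, matching the description of words "appearing in fewer than 10 of the documents".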
In the first stage of the analysis, we evaluate the number of topics required to optimally describe the discourse. The output of the analysis is shown in Figure A in the Supplemental Materials. The semantic coherence indicated a decrease after 20 topics. Therefore, we preferred a model including 20 topics—this model forms the basis of the analysis below.
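For readers unfamiliar with the measure, one common co-occurrence-based definition of semantic coherence can be sketched as below. We assume here that the score sums log((D(w_m, w_l) + 1) / D(w_l)) over pairs of a topic's top words, where D counts documents; whether this matches the exact computation behind Figure A is an assumption, and the function name is hypothetical.

```python
import math


def semantic_coherence(top_words: list[str], docs: list[set[str]]) -> float:
    """Sum log((D(w_m, w_l) + 1) / D(w_l)) over pairs of top words with l < m.

    D(w) counts documents containing w; D(w_m, w_l) counts documents
    containing both. Higher (less negative) scores indicate that the top
    words tend to appear together. Assumes every top word occurs in at
    least one document, otherwise D(w_l) would be zero.
    """
    def doc_count(*words: str) -> int:
        return sum(1 for doc in docs if all(w in doc for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            joint = doc_count(top_words[m], top_words[l])
            score += math.log((joint + 1) / doc_count(top_words[l]))
    return score
```

Comparing the average of this score across candidate values of the number of topics is one way to see where coherence starts to drop.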
We then fit the model itself using the STM package in R—to discover the content of these 20 topics. The technical details for the STM model fit using this package are given in Roberts et al. (2019). The primary parameters to be defined by the researcher using this technique are the seed, the initialization technique, and the maximum number of iterations after which convergence should be achieved. Topic models are estimated by setting a “seed” relative to which the other parameters are estimated. In other words, the seed represents a “starting point” for the model to start solving the relative distributions of words and documents over topics. Depending on the starting point chosen, the final model can differ, which is suboptimal from the point of view of reproducibility. In topic models with a fixed number of topics (as is the case for us, k = 20), reproducibility can be ensured by the use of spectral initialization—no matter what seed is set, the same results will be generated each time the model is run (Roberts et al., 2019). Therefore, we use spectral initialization here. Finally, convergence reflects the stage at which we reach the final model: each iteration improves on the model until the improvement is so small that further improvements are deemed negligible. We set a maximum of 75 iterations during model estimation. Convergence was reached after 33 iterations.
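The stopping rule described above can be illustrated with a generic sketch. This is not the estimation routine of the stm package: the callable `step`, the tolerance value, and the use of relative change in a bound are assumptions made for illustration, with the cap of 75 iterations taken from the text.

```python
def run_until_converged(step, initial_bound: float, max_iters: int = 75, tol: float = 1e-5):
    """Iterate `step` until the relative change in the bound falls below `tol`.

    `step` is a hypothetical callable that takes the current bound and
    returns the bound after one more iteration. The initial bound must be
    nonzero. Returns (final_bound, iterations_used, converged).
    """
    bound = initial_bound
    for iteration in range(1, max_iters + 1):
        new_bound = step(bound)
        relative_change = abs(new_bound - bound) / abs(bound)
        bound = new_bound
        if relative_change < tol:
            return bound, iteration, True
    # The cap was reached before the improvement became negligible.
    return bound, max_iters, False
```

In this framing, a run that stops after 33 iterations with a cap of 75 is one where the improvement fell below the tolerance well before the cap.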
We validated the results of the topic model by assessing the coherence and interpretability of the topics by manually examining the top terms for each topic and ensuring their alignment with recognizable themes in the discourse. We also consider longer quotes associated with each of the topics to gain further insight into their coherence. Additionally, we used semantic coherence scores to determine the optimal number of topics.
In interpreting the topic model, we consider those words with high frequencies within the topic, as well as words that are unique to a certain topic relative to the others. Based on these, the topic is assigned an interpretive label. Each topic is associated with a proportion—topics with higher proportions are more strongly represented in the discourse as a whole than topics with smaller proportions. A single document can contribute to multiple topics. Finally, it is worth noting that the fact that an article mentions “violence” does not mean that the article is fully about violence, just that the term is included in the article. Hence, the discourse we study is not necessarily about violence, but rather, the discursive space in which the term violence occurs. The topic model can also speak to change over time, by evaluating how the relative prevalence of the topics in the discourse develops over the years under study here. To examine changes in the discourse over time, we create a linear regression model in which the relative prevalence of the 20 topics is predicted by the year the article was published.
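The regression of topic prevalence on year can be illustrated with an ordinary least-squares sketch. The analysis itself was run in R; the plain-Python function below, including its name `ols_slope`, is an assumption introduced here to show how a slope, its standard error, and the resulting t-value relate.

```python
import math


def ols_slope(years: list[float], proportions: list[float]) -> tuple[float, float, float]:
    """Fit proportion = a + b * year by least squares; return (b, SE(b), t)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(proportions) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, proportions))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (intercept and slope).
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, proportions))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_value = slope / se if se > 0 else math.inf
    return slope, se, t_value
```

Because the standard error shrinks as the number of observations grows, even a very small slope can produce a large t-value in a corpus of this size.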
Word Embeddings
We use a Word2Vec word embedding model (Le & Mikolov, 2014) that is trained on our dataset. The model maps words in a high-dimensional space such that the embeddings of semantically related words are located near each other. This property allows for the retrieval of words that are semantically related to a query word (in our case “violence” and “violent”), by giving a similarity score, the cosine similarity. The cosine similarity reflects the conceptual similarity between the terms, as well as in part their grammatical function: nouns are more closely embedded with other nouns than with verbs. Cosine similarity ranges from −1 (vectors pointing in opposite directions) to 1 (vectors pointing in the same direction).
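As a brief illustration of the similarity score, the sketch below computes cosine similarity between two vectors and ranks the terms closest to a query word. The toy embeddings and the helper `most_similar` are assumptions introduced here; they are not the trained model.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two nonzero vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(query: str, embeddings: dict[str, list[float]], top_n: int = 500):
    """Return the top_n (term, similarity) pairs closest to the query word."""
    target = embeddings[query]
    scored = [
        (term, cosine_similarity(target, vec))
        for term, vec in embeddings.items()
        if term != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Because the measure depends only on the angle between vectors, two terms with embeddings of very different lengths can still have a similarity of 1.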
In the central part of the analysis, we extract a list of the 500 terms with the strongest cosine similarity to “violence” and “violent.” We extract a relatively long list so that the list includes not only terms with a strong association to violence but also terms in the mid-range to get a sense of terms in the periphery of the concept. The 500 closest words are based on 200-dimensional embeddings. We then used t-SNE to reduce the dimensionality of the embeddings from 200 to 2, to allow for visualization in a two-dimensional graph. To visualize the peripheral terms, we calculate the range within which most words fall (standard deviation [SD]) and suppress those terms that are within 1 SD from the mean on either dimension. This results in a figure in which the core of the concept is suppressed, and only terms on the diagonal remain. As in the topic model, we also assess changes over time in the embedding of “violence.” In this case, changes over time reflect changes in the association between “violence” and other words: associations may become closer or more distant over time. We create a linear regression model where the year in which the article is published is included as a linear term and examine its impact on the relationship between the term violence/violent and its associated terms.
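The suppression step can be sketched as follows. We read the rule as keeping only terms that lie more than 1 SD from the mean on both dimensions, which is what leaves terms on the diagonals; this reading, the use of the sample SD, and the function name `peripheral_terms` are assumptions.

```python
import statistics


def peripheral_terms(points: dict[str, tuple[float, float]]) -> list[str]:
    """Keep terms lying more than 1 SD from the mean on both axes.

    `points` maps each term to its two-dimensional coordinates (for example
    after dimensionality reduction). Terms within 1 SD of the mean on
    either axis are suppressed.
    """
    xs = [p[0] for p in points.values()]
    ys = [p[1] for p in points.values()]
    mean_x, sd_x = statistics.mean(xs), statistics.stdev(xs)
    mean_y, sd_y = statistics.mean(ys), statistics.stdev(ys)
    return [
        term
        for term, (x, y) in points.items()
        if abs(x - mean_x) > sd_x and abs(y - mean_y) > sd_y
    ]
```

A term that is far out on one axis but close to the mean on the other is suppressed under this rule, which is why the remaining terms cluster toward the corners of the plot.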
We validated the results of the word embeddings qualitatively—by evaluating the relationships between key terms (“violence” and “violent”) and their closest neighbors through interpretive assessment. The list of similar words was analyzed to confirm that the relationships identified were meaningful within the context of the discourse, ensuring their semantic relevance.
Note on Translation
The original Dutch text was used in the analysis; translations were applied at the stage of manuscript writing. There is one feature of the Dutch language that is worth keeping in mind in interpreting the results of the word embeddings described below. In Dutch, attributive adjectives (e.g., gevaarlijke) and predicative adjectives (e.g., gevaarlijk) have different forms, whereas in English they are indistinguishable when taken out of the context of the sentence. This means that in their English translation, these are rendered as duplicates, but in Dutch they are not.
Preliminary Findings
The corpus consists of 83,657 documents over the 10-year time frame, so we have 8,366 documents per year on average. The year with the most documents was 2015 with 9,155 articles (10% above average), and the year with the fewest articles was 2017 with 7,795 total documents (7% below average). The first 5 years (2012–2016) all generated an above-average number of documents, and the last 5 years (2017–2021) all generated a below-average number of documents. That is, there is some evidence of a decline in mentions of violence in newspaper articles over the years. This might be explained by the decline in total newspaper articles in this period, from 297,752 articles in 2012 to 213,208 articles in 2021. This overall decline in the volume of articles was in fact stronger than the decline in articles mentioning violence. This means that by 2021 a larger percentage of all articles mention violence: 3.71% in 2021 versus 2.96% in 2012.
Results
Topic Model
The topic model shows 20 different themes that contain references to violence. In what follows, we describe the main observations arising from the topic model. Table A in the Supplemental Materials gives details on each of the topics. Figure B (also in Supplemental Materials) maps the topics in relation to one another, based on their correlations.
Among the 20 topics identified by the model, the most prevalent topic was a miscellaneous topic that focused on pronouns (“me,” “mine,” as well as “someone”) and verb forms (“have,” “think”). This topic may have arisen as a result of a subgroup of articles with a more informal approach, or interview style, describing people’s personal experiences. This was confirmed when exploring slightly longer quotes: “Reading newspapers, watching television—I hardly do any of that anymore. War, disasters, violence, these are all things I can’t do anything about. I get [expletive]depressed from the endless amount of sensational reporting, the manufactured outrage. I feel like the media is increasingly focused on polarization.”
In terms of the content-related topics, several topics that we might intuitively expect to find in the discourse on violence were represented here. Some of the most prevalent topics in the discourse were—for instance—a War topic where central terms included “war,” “weapon,” “Syria.” There were also two topics centering on (responses to) interpersonal violence: a Police topic (“police”; “officer”; “arrested”) as well as a topic reflecting the Legal System (“court”; “lawyer”). Then, there were several topics centering on specific kinds of violence, both political and interpersonal. On the side of interpersonal violence, there were topics such as Children and Young people (“school”; “child abuse”; “Child Protective Services”) and family relationships (“mother”; “father”). There was also a Public Order topic (“nuisance”; “order”; “city”). Inspection of the quotes associated with this topic generated examples of discourse about legislation introduced to reduce the nuisance and violence around nightlife in the city: “A few years ago, the alcohol ban behind [the square] was lifted [. . .] This is a popular place to sit down with a drink, which has recently led to complaints of noise, public urination, loitering and violence.” On the side of political violence, there were topics centering on Electoral violence (e.g., “election”), Human Rights (e.g., “Human Rights”; “Rohingya”), and Islamic Terrorism (e.g., “terrorist”; “Caliphate”). There were also two topics focusing on responses to ongoing conflicts, both political and economic/financial. Specifically, there was a topic we labeled European politics, which included terms such as “European Union” and “refugee.” In the Economy topic, we see references to violence in discourse around business, economics, and finance (“Euro”; “Shell”; but also “Climate change”). Inspection of the quotes associated with this topic demonstrated that there were discussions of the impact of violent conflicts on (e.g.,) stock markets. 
There was also a topic centering on the Israel/Palestine conflict (“Palestinian”; “conflict”; “Jerusalem”).
The remaining seven topics required a little more analysis. There were four topics where references to violence occur in cultural discourse, specifically Sport (“football”; “match”), Art and Entertainment (“film”; “Tarantino”; “artist”), Historical work (“museum”; “history”; “novel”), and in the Christian religion (“church”; “God”). Finally, there was a section of the discourse where the term violence appears in discussions of wider socio-cultural challenges. In the Social Media topic (“Social media”; “fake news”; “internet”) we find discourse around the challenges presented by Social Media, including the idea that social media glorifies and encourages violence. In the U.S. Affairs topic, we find discourse about the political polarization during the 2017–2021 presidency of Donald Trump, including the accusation that he had been “inciting violence,” as evidenced by the quote “He said on Friday that his colleague, and Speaker of the House of Representatives, Nancy Pelosi, will request the Senate on Monday to impeach Trump for incitement to violence and insurrection. Trump is said to have provoked the storming of the Capitol on January 6.” A second component of the U.S. Affairs topic was a focus on racism. Key terms were “George Floyd” and “Black,” and the topic contained statements such as “Activist group ‘Black Lives Matter’ is demanding commitments from [political] candidates that they will put an end to police violence against black Americans.” Similarly, in the topic of Gender and Sexuality, #MeToo was among the most informative terms, alongside references to the LGBT+ community (“gay”). In this section of the discourse, then, references to violence appear in discourse about societal challenges, including social media, political polarization, and social injustice faced by ethnic minorities, women, and LGBT+ groups.
We then considered change over time in the prevalence of topics. Table 1 shows the results of the t-tests evaluating the effect of year of publication on the relative prevalence of each topic. These are relative changes (in % of the total volume)—if one topic shows an increase this must mean other topics are showing a decrease. From Table 1 we see that nearly all topics show significant changes over time. However, the effect estimates are very small—the robustness of the effect seems to arise primarily as a result of the large dataset. Therefore, we interpret only the strongest effects (t-value above 20 or below –20; highlighted in bold in Table 1). The two topics that showed the strongest increases were U.S. Affairs and Social Media. These increases came (primarily) at the expense of the topics of War and Islamic Terrorism, which showed significant decreases. This suggests that, over the 10 years under study here, the usage of the term violence seems to shift somewhat away from political contexts, and increasingly appears in wider discussions surrounding societal challenges, such as social media, political polarization, and racism. We further explored whether these trends were comparable in newspapers with more conservative and more progressive orientations—this exploratory analysis is reported in the Supplemental Materials.
Table 1. Relative Change Over Time for Each of the 20 Topics. The t-Test Evaluates the Effect of the Covariate Year on the Topic Proportion.
Note. The effects shown in bold are discussed further in the text. CI = confidence interval.
Word Embeddings
The word embeddings focused on the meaning of the term violence itself. As for the topic model, we first provide an overview, and then consider change over time. Results showed a clear focus on terms with negative valence, in line with the idea that violence is a concept with a negative valence. Further, the results showed a broad distinction between terms that are directly linked to violence and more associative connections. Terms that were more central to the concept of violence (as evidenced by higher cosine similarities) tended to have very explicit connections to violence (“firearm violence”; “street violence”; “war violence”; “aggressive”; “criminal”). Conversely, words with more restricted meanings (e.g., “human rights violation”; “blood thirsty”), or words with more associative connections to violence (e.g., “intolerance”; “persecution”; “racist”; “unhappy”), were associated with cosine similarities in the mid-range. To further examine the “periphery” of the concept of violence, we plot the cosine similarities in two dimensions. We then suppressed the most central terms (less than 1 SD from the mean on either axis), leaving only those terms that are on the diagonal, away from the “core” of the concept. This resulted in Figure 1. In the figure, we see some miscellaneous terms (“recent”), and some general ones (“dangerous”). The more informative terms belong to two clusters: extreme ideologies (extreme-right, jihadist, extreme-left, radical, etc.) in the top-right quadrant, and social injustice (injustice; racist; antisemitic) in the top-left and bottom-left quadrants. We also see police violence in the bottom-left quadrant.

Figure 1. Terms in the periphery of Violen* (at the origin).
In the final step of the analysis, we explore change over time, using the year of publication of the article. We create three compounds by taking the average cosine similarities of the terms in each cluster: extremism, social injustice, and miscellaneous. We apply a linear regression model, in which cosine similarity scores are predicted by the year of publication and the different compounds. Results showed that the year of publication affected cosine similarities for the social injustice compound, B = 0.011, 95% CI [0.005, 0.017], t(38) = 3.64, p = .001. The same effect did not reach significance for the extremism compound, B = −0.0008, 95% CI [−0.005, 0.003], t(157) = −0.403, p = .687, nor for the miscellaneous compound, B = 0.0009, 95% CI [−0.003, 0.005], t(115) = 0.392, p = .696. As the years progress, the relationship between the concept of violence and the concept of social injustice becomes increasingly close. This effect is not seen for the other compounds.
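The construction of the compounds can be illustrated with a short sketch. We assume here that a cosine similarity to “violence” is available per term and per year; the data layout and the function name `compound_scores` are hypothetical, and any numeric values passed in would be the analyst's own.

```python
from collections import defaultdict
from statistics import mean


def compound_scores(
    similarities: dict[tuple[str, int], float],
    clusters: dict[str, str],
) -> dict[tuple[str, int], float]:
    """Average cosine similarity per (cluster, year).

    `similarities` maps (term, year) to that term's cosine similarity with
    the query word in that year; `clusters` maps each term to its cluster
    label. Terms without a cluster label are ignored.
    """
    grouped = defaultdict(list)
    for (term, year), sim in similarities.items():
        if term in clusters:
            grouped[(clusters[term], year)].append(sim)
    return {key: mean(values) for key, values in grouped.items()}
```

The resulting yearly averages for each compound are the kind of series on which a linear effect of year can then be tested.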
Discussion
This work explores how the concept of violence is understood in contemporary Western societies, taking the Netherlands as a case study. We explore the discourse on violence used in national newspapers, through novel machine learning techniques—notably a topic model, and word embeddings. The topic model aims to understand the discourse as a whole, while word embeddings focus on understanding the meaning associated with individual terms. As such, their combined findings give us a comprehensive overview of the meaning of the term “violence” in the public discourse.
The topic model demonstrated that the discourse on violence in the public domain in the Netherlands has a subsection that can be described as the familiar or “traditional” usage of the term violence—referring to violent political conflicts, or interpersonal violence like child abuse. These topics covered the majority of the discourse (approximately 70%). Alongside this, we see that the term violence is also used in cultural discourse around history, entertainment, and the like (20%). Finally, there is a subsection of the discourse (10%) where references to violence appear in discourse about societal challenges, including Social Media, political polarization, and social injustice faced by ethnic minorities, women, and the LGBT+ community. These topics were among the least prevalent in the discourse, but their prevalence did increase over time (particularly the Social Media and U.S. Affairs topics), at the expense of the Terrorism and War topics. As such, there was some evidence that the term violence increasingly appears beyond the context of war and crime, in a wider socio-cultural usage. The link between violence and issues of social injustice in particular calls to mind the concept of structural violence (and the like), as a counterpart to political and interpersonal violence in the other topics.
References to violence beyond the traditional realm of crime and war were also evident in the word embeddings. The core of the concept consisted of terms with a very direct relationship to violence, including both interpersonal and political exemplars. However, in the periphery of the concept the issue of social injustice re-appears (alongside extremist ideologies). The social injustice terms in particular were increasingly drawn into the realm of violence as the years progressed, so that by the late 2010s and early 2020s, there was a significantly stronger association between violence and social injustice than in the early 2010s. Overall, then, the findings suggest that the association between violence and social injustice grows over time, both in terms of the discourse within which the term is used and in the meaning associated with the term itself.
As noted above, the connection between violence and social injustice is key to frameworks of structural violence (Galtung, 1969) and symbolic violence (Bourdieu & Wacquant, 1992) within academia. These perspectives argue that social injustice causes extensive harm and oppression to vulnerable groups in our societies, to such an extent that social injustice is not only linked to violence (e.g., social injustice as a trigger for riots) but social injustice is a form of violence in itself. These perspectives have a comparatively long history in academic sociological research, but based on the findings from the current study we might say that we are seeing increasing recognition of such understandings among the general public too. As outlined in the introduction, this development is likely made possible (at least in part) by reduced exposure to “traditional” forms of violence as a result of crime or war. In such a situation, social injustice is an ideal candidate for inclusion under broader conceptualizations of violence. It produces significant and long-lasting harm that is not physical per se, but instead takes the form of reduced access to employment and education (Triventi, 2013), as well as poorer health and social stigmatization (Abman et al., 2020). That is, it taps into expanded notions of harm as well as violence (Hall & Winlow, 2018; Presser, 2013). Given that experiences of social injustice are related to a person’s group membership in a certain disadvantaged category, it also seems inherently problematic from a Western cultural context that is based on notions of meritocracy and individualism (Littler, 2018). In short, the connection between violence and social injustice builds on several concurrent trends in public attitudes, but also in academic thinking.
Diversity: Group Differences
This work examines public attitudes to violence as conveyed in national newspapers in the Netherlands. Groups that hold greater social and cultural capital in society are more able to shape the public debate on a given topic (Van Dijk, 2013). As such, we might expect that public attitudes (as reflected in newspapers) represent the attitudes of dominant groups more than marginalized groups (Fleras, 2011). If we see public attitudes as primarily the attitudes of dominant groups, it is interesting to note that marginalized groups loom relatively large in the discourse on violence, including the LGBTQI+ community, women, and racialized minorities. Within this discourse, members of disadvantaged groups are represented primarily as victims of violence and transgressive treatment. Above, we have interpreted this as a type of sensitization among the general public to the harmful consequences of social injustice. Alternatively, we might describe this process as a sensitization among members of dominant groups, to how social injustice and violence come together in the lives of marginalized groups. This does not mean that everyone accepts the relevance of injustice and marginalized groups to debates around violence—but rather that the connection is increasingly being discussed in the public domain, with some who see it as highly relevant in debates around violence, and others who dismiss it. To summarize, then, issues of social injustice and diversity play a central role in this work.
Related to the issue of diversity in the data, exploratory analysis of differences between the newspapers included in our set (described in the Supplemental Materials) indicated that the tendency to use the term violence in a broad sense was more prevalent among progressive newspapers compared to conservative newspapers. Importantly, it is not necessary for our argument that there is full consensus on how the term violence can be used. Instead, we argue that discourse around violence is subject to a process of social construction of the term, within which there is space for disagreement (see Boches & Cooney, 2022). Further, to the extent that we see “progressive” as indicating an orientation toward change and development, these findings support our analysis of change over time.
Another reflection point is related to the Netherlands as the study context. The Netherlands is a medium-size country (in terms of population) and thus the newspaper readership is smaller than in countries with larger populations. This is relevant to the study methodology; in the Netherlands, there is a limited set of national newspapers, and as such, with comparatively few sources we can capture the public discourse comprehensively. In terms of the impact of the Dutch context on our results, we do not believe there are concrete reasons to suspect that trends in the Netherlands are out of keeping with other countries in Western Europe, or the West more generally. That said, each country has its own unique features. Here, this was evident in the History topic, where we saw considerable discussion of Dutch colonial violence perpetrated in Indonesia. We encourage similar research across different country contexts.
Limitations
There are several limitations associated with this work. First, we have taken newspaper articles as a representation of the public discourse on violence. However, newspapers are not the only form of public discourse; in recent years especially, public discourse has also taken place in the online sphere. We believe newspapers can reasonably capture general trends, especially those that are more settled than the fast-paced world of online media. Further, newspapers are relatively neutral compared to social media, as their copy is produced professionally. This likely tones down any extreme views (in either a progressive or a conservative direction). In this way, we believe that our use of newspapers represents quite a conservative test of our hypothesis. To speculate briefly on what our findings might look like if we had considered texts sourced from social media: we would expect the patterns identified here to be more pronounced, due to the more extreme and less professional nature of social media compared to newspapers. More specifically, we might also expect the concerns expressed in the Social Media topic to be less pronounced when sourcing texts from online sources.
A second limitation of this work, related to the one above, is that we have interpreted the findings as reflecting public attitudes or social constructions on the topic of violence, but newspaper articles are not a pure measure of public attitudes. The discourse in newspapers is constrained in part by objective news events. That is, when we see a decline in the prevalence of certain topics over the years, this may reflect a decrease in public concern, but also a decrease in objective instances of the event.
Future Research Directions
The limitations of the current study indicate some areas where further research is needed. First, future work might consider developments over longer timeframes than the 10 years under study here. Even the short timeframe used in the current study generated a lot of documents to be analyzed. As such, follow-up studies with longer timeframes might require more specific search terms and narrower research questions, to constrain the number of documents to be analyzed. Second, future work might examine to what extent the developments outlined here are evident in other countries and cultural contexts beyond the Netherlands, which has been our focus here. Third, related to our use of newspaper data, further work is needed that captures public attitudes more directly, for instance through surveys or experimental work (see e.g., Triplett et al., 2016), or indeed through other types of media such as posts on social media platforms. Finally, in this work, we cannot capture the directionality of the relationship between violence and social injustice. That is, it remains unknown whether violence is expanding toward issues of social injustice or whether discussions of social injustice are shifting toward a focus on violence. Our initial conjecture is that these discussions feed each other, but a more definitive answer to this issue requires further testing.
Contributions and Implications
We believe that some of the implications of the developments outlined in this work are already evident in praxis around violence. As we have seen, the concept of violence seems to be expanding, and such expanded notions of violence can trigger greater concern about violence (Cooney & Burt, 2008). In line with this, we have seen an increase in public demand for policies and legislation that aim to contain “new” forms of violence, for instance “upskirting” in the U.K. (Gillespie, 2019; Yar & Drew, 2019). The extent to which legal practice can and should reflect public priorities in the domain of transgression, violence, and safety is a key consideration for legal scholars. This work can inform such considerations by highlighting how norms in this domain evolve, as well as by underpinning these developments with empirical data.
Second, we hope that the current paper will contribute to demonstrating the benefits of machine learning techniques in answering research questions arising from the field of violence studies. Word embeddings and related machine learning techniques have been used to study topics related to crime, abuse, and violence (see e.g., Al-Hashedi et al., 2019; Bonisoli et al., 2021; Park & Kim, 2021), but this work arises primarily from the computer sciences literature—as such the contribution of this work is often framed in terms of the usability of the models and developments in modeling, rather than the applicability of the findings to the field of violence studies.
In sum, we believe that this work makes several key contributions to the study of violence. First—of course—it offers reflections on how public attitudes to violence are evolving, by highlighting the connection with social injustice in the public perception. A second (and related) contribution of this work lies in the fact that it generates empirical evidence to support conceptual analyses and observations from previous work. In reviewing the results of a large research initiative on violence, Stanko (2003) commented on the tendency for issues of social injustice to crop up across the contexts under study. She suggests that the increasing intolerance of violence reflects—at least in part—an intolerance of inequality (see Stanko, 2003, p. 4)—in line with the findings of the current study. Regarding the contribution of this paper to the study of violence more generally, we believe the current work highlights a relevant contribution to be made by work from low-violence contexts, such as is found in many Western (European) countries. Finally, we outline a novel technique by which to study these high-level societal developments, namely by the application of machine learning techniques. This approach has developed markedly in recent years, but to the best of our knowledge has not been applied to the understanding of violence until now.
Supplemental Material
Supplemental material, sj-docx-1-jiv-10.1177_08862605241301793 for Examining the Meaning of “Violence” Through Machine Learning Techniques by Jolien van Breen, Emil Rijcken, Jaroslaw Kantorowicz and Marieke Liem in Journal of Interpersonal Violence
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interests with respect to the authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: This work was supported by the Dutch Science Foundation (NWO—grant number 406.XS.03.036).
