Abstract
Civil wars are more than battles between governments and rebels; they involve a multitude of actors who perpetrate different forms of violence linked to the war in some way. However, scholars often study different types of violence perpetrated in wars in isolation rather than in their interrelationship—a compartmentalization that further widens the gap between research and practice. Despite the wide availability of disaggregated data, linking various forms of violence to one another and attributing them to a specific civil war remains a challenge. This article discusses the violence attribution problem and introduces an approach that connects different forms of war-related violence to specific civil wars using data on actors, event locations, and conflict zones. In a practical application of this approach, we introduce an R package that integrates battle and war-related violence data collected by the Uppsala Conflict Data Program. The overall aim is to provide a more comprehensive measure of the violence taking place in a particular war and facilitate a better understanding of the dynamics and interrelations of different types of violence within civil wars.
Introduction
Civil wars are more than battles between rebels and a government. They involve a complex network of actors perpetrating violence that is intricately tied to the ongoing conflict. Rebel groups battle each other, combatants abuse civilians, communal groups fight, and criminal organizations compete over the spoils of war. Propelled by access to disaggregated data with information on the exact location and date of violent events (Donnay et al., 2019; Gleditsch et al., 2014), scholars generate much-needed knowledge on the causes, dynamics, and consequences of multiple forms of political violence at the sub-national level. However, these quantitative-comparative studies usually examine distinct forms of violence in isolation, overlooking their interrelationships within the broader conflict context (Bara et al., 2021: 929).
This compartmentalization widens the gap between researchers and case experts, who take into account the complex web of violence in one conflict. It also means that important and policy-relevant questions are not answered: under what conditions do actors in civil wars choose one violent strategy over another? Do some interventions to end violence transform violence instead? How can we make sure that all war-related violence stops when wars end? To answer such questions, scholars have repeatedly called for an integrated approach to studying political violence (Bara et al., 2021; Findley and Young, 2012; Kalyvas, 2019; Straus, 2012).
Conceptually, scholars have answered this call and offered frameworks for analyzing violence dynamics in complex civil war settings (see, e.g., Brosché et al., 2023; Staniland, 2017). Empirically, however, the gap between the study of political violence and the study of conflict remains. It is hard to link different forms of violence to one another and to specific civil wars. Information on various forms of violence is in different datasets with distinct typologies and coverage. Even if these can be computationally integrated (Donnay et al., 2019), scholars analyzing the dynamics of violence within civil wars face a problem of attribution: how do we know that an act of violence is related to a particular war? After all, violence also occurs outside of wars. Moreover, some countries experience multiple simultaneous rebellions, and a violent event could be related to any of them. Finally, while some violence is clearly war-related, for instance, when combatants use violent tactics other than battle to achieve their aims, in other instances the link is more tenuous. Without the ability to attribute most related violence to a war, it is challenging to measure how violent one war is compared to others.
In this article, we present an approach to link various forms of war-related violence to specific civil wars by leveraging information on the actors and the locations of the violence. First, we discuss the violence attribution problem in greater detail. Then, we describe our approach, which is implemented in an R package.
Civil war complexity and data attribution
Studying violence within complex civil wars is empirically challenging, even within the streamlined and internally coherent UCDP data universe. The launch of the UCDP/PRIO ACD in 2002 set off a wave of research on the conflict level, examining the causes and effects of violent conflict (Gleditsch et al., 2014). This data includes no information on violent events, only a categorical measure of the intensity of the conflict. When fine-grained violence event data became available in the early 2010s, for instance, with the GED, detailed research on different kinds of political violence on the sub-national level emerged (Balcells and Justino, 2014). Yet leveraging information on the latter to study the former is surprisingly difficult.
This difficulty stems from a double attribution problem. First, responsibility for an event must be assigned to an actor. Others have explored this issue, particularly the challenge of assigning responsibility for unclaimed terror attacks (Bauer et al., 2017) or when militias are used to avoid accountability (Carey et al., 2016). We rely on existing UCDP data and do not address this first challenge. Our focus is the second problem: linking events to specific civil wars. Many events recorded in the UCDP GED cannot easily be attributed to a particular conflict recorded in the UCDP ACD. For instance, the government of Myanmar has killed thousands of civilians, yet there is no indicator in the UCDP GED that specifies whether these killings were related to any of the ethnic conflicts in Myanmar, and if yes, to which one. Here the challenge is that the government is not a uniquely identified actor, as it fights in multiple conflicts within the same country. To study government violence against civilians in civil wars, some scholars have thus attributed all such killings in the country to each conflict or divided casualty numbers equally between conflicts (e.g., Hultman et al., 2013; Kathman and Wood, 2016). This discards variation in government killings across conflicts and assumes all civilian deaths are war casualties. Additionally, violence by actors never challenging the government cannot be linked to conflicts via an actor identifier at all. This includes pro-government militias, irregular armed groups, gangs and criminals, politicians and their supporters, or civilians and their communal groups.
These challenges are reflected in existing civil war research. To date, quantitative scholars have primarily studied war-related violence in the context of civil wars where attribution to a conflict is comparatively easy. That is the case when violence can be linked to civil wars via the actors involved, for instance, when rebel groups target civilians (Hultman et al., 2013; Kathman and Wood, 2016) or use terrorist tactics (Fortna et al., 2020). Others have leveraged the detailed location coding of violent events to study violence on a sub-national level, for example, comparing regions or grid cells. This has proven effective for studying the micro-foundations of political violence, but this scholarship has yet to explore the specific connections between sub-national violence dynamics and broader conflict dynamics (Balcells and Justino, 2014: 1345).
Linking UCDP GED events to ACD conflicts
We leverage both actor and location information to connect data on deadly organized violence collected in the UCDP GED to civil wars recorded in the UCDP/PRIO ACD. Specifically, we consider GED events as war-related if they are either perpetrated by the war’s main protagonists or if they occur in the conflict zone of a civil war. Two studies have used a similar approach to study terrorism (Findley and Young, 2012) and peacekeeping (Bara, 2020) in complex conflict settings. To enable more studies of this type, we provide a standardized approach, easily executed in an R package.
Civil wars are defined—by the UCDP and others—by
Our two-step process to link such violence events to conflicts is shown in Figure 1. First, we link data via actors where possible. In the UCDP GED, this is only possible for violence by and between rebel groups that fight the government in a civil war and are therefore listed as actors in the ACD. Second, where actor linking is not possible, we intersect the coordinates of a possibly war-related event with the location of the civil war, that is, the conflict zones. Events within a conflict zone are considered related to that war, those outside are not.
Figure 1. Process to attribute violence events to civil wars as war-related violence, with India example.
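The two-step procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code of the R package: the class and field names, the actor IDs, and the rectangular toy zone are assumptions made for the example. The point-in-polygon test uses simple ray casting on planar coordinates and ignores the temporal (episode) dimension of conflict zones.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Point = Tuple[float, float]  # (longitude, latitude), treated as planar


@dataclass
class Conflict:
    conflict_id: str          # hypothetical identifier
    rebel_actor_ids: Set[str]  # rebel groups fighting the government
    zone: List[Point]         # conflict-zone polygon as a vertex list


@dataclass
class Event:
    event_id: str
    actor_ids: Set[str]  # actors involved in the event
    location: Point


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count polygon edges crossed by a ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def attribute_event(event: Event, conflicts: List[Conflict]):
    """Return (list of conflict IDs, linking method) for one event."""
    # Step 1: link via actors where possible.
    by_actor = [c.conflict_id for c in conflicts
                if event.actor_ids & c.rebel_actor_ids]
    if by_actor:
        return by_actor, "actor"
    # Step 2: otherwise intersect the event location with conflict zones.
    # Overlapping zones can yield more than one conflict.
    by_zone = [c.conflict_id for c in conflicts
               if point_in_polygon(event.location, c.zone)]
    if by_zone:
        return by_zone, "location"
    return [], "unattributed"


# Toy example with a made-up square zone and made-up actor IDs.
toy_zone = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
toy_wars = [Conflict("war_1", {"rebel_a"}, toy_zone)]
inside_event = Event("e1", {"government"}, (5.0, 5.0))
outside_event = Event("e2", {"government"}, (20.0, 20.0))
```

In this toy setup, `attribute_event(inside_event, toy_wars)` links the event by location, while `outside_event` stays unattributed, mirroring the true-negative case discussed for Figure 1.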
The map in Figure 1 illustrates the result for four events in India in the mid-2000s, none of which can be linked via an actor ID. Events A and B are instances of government violence against civilians, and we are able, using the location of the event, to attribute each to the conflict context in which it took place, that is, the Kashmir and Naxalite wars. These are true positives, so to speak: events correctly attributed to a conflict. Event D is not attributed to any civil war, and rightfully so, as this was an event of Hindu-Muslim riots not related to any of India’s civil wars. This is a true negative: an event correctly identified as not war-related. Event C, a false negative, shows a limitation of the approach. The 2007 Samjhauta Express bombing close to the capital killed 70 civilians and was later attributed to at least one group that has its main area of operations in Kashmir. This terror attack can in many ways be considered related to the Kashmir conflict, but we cannot attribute it: first, because the perpetrator is not coded as an opponent of the government by the UCDP and second, because it took place far away from the conflict zone.
Our approach may therefore miss some war-related events, but there is no “ground truth” to ascertain this. If we knew what is war-related and what not, our approach would not be necessary. Whether an event would have occurred without the war is a counterfactual that cannot be observed. Moreover, having more case knowledge does not necessarily help, as Kalyvas (2003: 476) notes: “the more detailed the facts, the bigger the difficulty in establishing the ‘true’ motives and issues on the ground (…).” Nevertheless, in the Supplemental File we show that events that are most likely war-related (because the perpetrators are war combatants) predominantly occur within conflict zones. This suggests that conflict zones can be used to effectively classify other violent events as war-related, provided they are static enough to account for the fact that war-related violence may occur in the same area as battles, but not necessarily at the same time. Our zones are constant through a conflict episode, defined by the UCDP as a period of continuous conflict separated by no more than a year of inactivity. This acknowledges that the location of the conflict may shift between episodes.
Besides “missing” some events, our approach also assumes that events occurring in the conflict zone are indeed war-related. Again, this assumption is difficult to test without ground truth. Thus, when we classify violence as war-related, we infer what we do not know (whether the event would have happened without the war) from what we do know: actors’ links to a war and/or the location of the violence. It should also be noted that the zones of some conflicts partially overlap, and violence in these overlapping areas is attributed to more than one conflict, which may (in regional conflict complexes) or may not reflect the reality on the ground.
Our approach links 81% of events in GED to a civil war. Among unlinked events, a quarter are battles in inter-state wars, that is, they could be linked if we included these conflicts. The rest remain unattributed for several reasons, but mostly because they happen outside civil war contexts. The majority concern drug-related violence in Latin America. Some events may be related to a war, but either we have no conflict zone for it or the conflict zone does not capture this violence. The extreme case is Rwanda, where not all events of the Rwandan genocide are captured because the conflict zone does not cover the whole country.
The dataset that results from our linking procedure is a conflict-month dataset of battle and war-related violence from 1989 to 2023, covering 181 civil wars over 366 distinct episodes. Each conflict is observed monthly, and users can choose to include any number of inactive (post- or interwar) months as well.
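To make the conflict-month structure concrete, the following Python sketch aggregates already-attributed events into monthly death counts per conflict. It is a minimal illustration under stated assumptions, not the package's implementation: the dictionary keys (`date`, `deaths`, `is_battle`, `conflict_ids`) and the example records are hypothetical, and inactive months are not filled in.

```python
from collections import defaultdict
from datetime import date


def conflict_months(attributed_events):
    """Sum deaths per (conflict, month), split into battle and
    war-related violence. Field names are hypothetical."""
    totals = defaultdict(lambda: {"battle": 0, "war_related": 0})
    for ev in attributed_events:
        month = ev["date"].strftime("%Y-%m")
        kind = "battle" if ev["is_battle"] else "war_related"
        # An event in overlapping zones counts toward each linked conflict.
        for conflict_id in ev["conflict_ids"]:
            totals[(conflict_id, month)][kind] += ev["deaths"]
    return dict(totals)


# Made-up records for illustration only.
toy_events = [
    {"date": date(2005, 3, 4), "deaths": 3, "is_battle": True,
     "conflict_ids": ["war_a"]},
    {"date": date(2005, 3, 20), "deaths": 2, "is_battle": False,
     "conflict_ids": ["war_a", "war_b"]},
    {"date": date(2005, 4, 1), "deaths": 5, "is_battle": False,
     "conflict_ids": ["war_b"]},
]
monthly = conflict_months(toy_events)
```

Counting an event once per linked conflict follows the treatment of overlapping zones described above; users who need country-level totals would have to deduplicate such events first.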
Illustration
In this section, we illustrate how our data captures the variety of violence in wars and how it offers a more comprehensive, and less biased, measure of the severity of war.
Figure 2. Distribution of deaths from main types of violence recorded in the data. One square represents 10,000 people killed.
Data illustration 1: Conflict complexity
Figure 2(A) shows, unsurprisingly, that military battles are the dominant form of violence in war, accounting for three quarters (77%) of all deaths, including 9% collateral civilian deaths. A significant portion (18.5%) of deaths results from the intentional targeting of civilians by the warring parties and other armed groups (Figure 2(B)). Finally, 4.5% of deaths occur in fighting between armed or communal groups (Figure 2(C)). These numbers underscore the complexity of wars and the substantial share of deaths from war-related violence. Importantly, more than half of deaths incurred in this war-related violence (57%) occur in events not easily linked to civil wars by actor, and where our spatial approach is needed to capture violence that would otherwise be missed.
Figure 3. Patterns of violence across two wars in India.
The violence types that dominate the conflict landscape vary between conflicts and over time within the same conflict. For instance, in India, the Kashmir war is dominated by military battles and civilian targeting, whereas the conflict over Nagaland shows a different dynamic (see Figure 3). By standards common in quantitative research, the Naga war was only active from 1992 to 1997 and in 2000. Even outside these years, however, the conflict involved high levels of non-state violence, especially communal violence between the Kuki and Naga tribes that was linked to both groups’ fight for independence (UCDP, 2024). As battlefield violence decreased and ceasefires were reached, the nature of violence shifted to fighting between different groups and factions claiming to represent the Naga cause, with civilians continuing to be targeted. While case experts have studied these dynamics within India (Kolås, 2011; Staniland, 2017), our data provides an opportunity for scholars in the quantitative tradition to study such dynamics across a broader set of cases.
Data illustration 2: Measuring conflict severity
Our data can also be used as a more comprehensive measure of conflict severity by summing up deaths from battle and war-related violence. To date, battle deaths are often used as a proxy for a war’s overall intensity due to the absence of indicators that capture a broader spectrum of violence in war. This is not per se a problem. Given that battle deaths constitute a large portion of violence in wars (Figure 2), we believe that the results of most previous studies on conflict intensity would remain valid if battle deaths were replaced with our more comprehensive metric. Still, it is crucial to recognize that conflicts characterized by intense military battles are not always those with most war-related violence. Battle deaths poorly predict war-related violence, explaining only 17% of its variation on the conflict-month level and 33% on the conflict-year level.
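The article does not spell out the model behind the explained-variation figures. One common reading is the R² of a bivariate linear regression of war-related deaths on battle deaths, which equals the squared Pearson correlation. The Python sketch below computes that quantity under this assumption; the function name and the toy numbers are made up for illustration and do not reproduce the reported 17% or 33%.

```python
def r_squared(x, y):
    """R-squared of a bivariate OLS of y on x (squared Pearson r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Made-up conflict-month values, for illustration only.
toy_battle_deaths = [0, 1, 2, 3]
toy_war_related_deaths = [1, 0, 1, 0]
toy_r2 = r_squared(toy_battle_deaths, toy_war_related_deaths)
```

A low value of this statistic means that knowing the battle death count of a conflict-month says little about how much other war-related violence occurred in it.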
Moreover, our analyses suggest that battle deaths capture the intensity of a war more accurately for some conflicts than for others. As the first graph in Figure 4 shows, in wars like Afghanistan, Syria, and Iraq, battle deaths align well with overall intensity. In conflicts in the DRC, Sudan, or Rwanda, in contrast, relying on battle deaths significantly underestimates the total death toll. Using battle deaths as a proxy for overall war severity can therefore lead to biased conclusions, particularly when comparing different types of conflicts. Civil wars in which strong, well-armed rebels engage in conventional warfare tend to show a smaller discrepancy between battle deaths and overall intensity than insurgencies or symmetrical nonconventional wars (Balcells and Kalyvas, 2014), where war-related violence significantly contributes to the overall deadliness. This is illustrated in the second graph in Figure 4, which shows that battle deaths align more closely with overall deaths in conventional wars compared to other types of conflicts. Using newer data on rebel access to Major Conventional Weapons (MCW; Pamp et al., 2024), often a key criterion for defining conventional wars, further substantiates this finding. Studies that rely on battle deaths to measure conflict severity may therefore not capture the variation in the severity of insurgencies and other nonconventional wars as accurately as they do for conventionally fought civil wars.
Figure 4. (a) Difference between battle (gray) and total violence (black) in the 20 conflicts with most fatalities, fatalities logged. (b) Violence beyond battle deaths depending on conflict type. Percentage increases in graph refer to non-log-transformed violence data. Source data in conflict-years.
Conclusion
Civil wars continue to increase in complexity—regarding the number of actors involved, the disputed issues, and the multitude of violent tactics used by all parties (Brosché et al., 2023). To fully understand these wars, comparative researchers advocate for a more integrated approach to studying violence dynamics in armed conflict. In this article, we have presented a step toward facilitating this endeavor.
While our approach offers a principled method for linking various forms of violence to wars, it has limitations. Specifically, it works best with data from the UCDP universe, which limits the types of war-related violence that can be studied to lethal violence that is highly organized. While it is possible to extend our approach to link violence from other georeferenced event datasets, this requires additional considerations. First, matching actors across datasets requires a name translation process, which, while feasible, is time-consuming (see Fortna et al., 2020). Second, when attributing violence from datasets with different but partially overlapping violence definitions, event disambiguation must precede our approach to avoid counting the same event multiple times (Donnay et al., 2019). Third, discrepancies in geolocating and temporal recording across datasets must be addressed for consistent integration.
We see the biggest potential in our data and R package in enabling scholars to simultaneously study multiple types of violence, to answer questions such as: Does violence tend to return to the community level once the “big” war ends (Brosché and Elfversson, 2012)? Do peace negotiations provoke violence from excluded actors who demand a seat at the negotiation table (Sisk, 2009: 3)? Are there forms of violence that are used as substitutes (rather than complements) to achieve war objectives? These are highly relevant questions, as is the question of how deadly a war generally is—a question to which battle deaths alone can only give a partial answer.
Supplemental Material
Supplemental Material - Who, what, and where? Linking violence to civil wars
Supplemental Material for Who, what, and where? Linking violence to civil wars by Corinne Bara and Maurice P. Schumann in Research & Politics.
Footnotes
Acknowledgments
We are grateful for feedback on earlier versions of this manuscript from colleagues at the Hertie School International Security Research Colloquium, 2023, and the Annual Conference of the Swiss Political Science Association in St Gallen, 2024. Special thanks to Samantha Gamez and Christian Gläßel for close reads and detailed feedback.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: this work was supported by Vetenskapsrådet (grant 2020-03936).
Carnegie Corporation of New York Grant
This publication was made possible (in part) by a grant from the Carnegie Corporation of New York. The statements made and views expressed are solely the responsibility of the author.
