Abstract
This article presents the Myanmar Protest Event Dataset, a unique dataset on protest assemblies in transitional Myanmar/Burma. The data contents were derived from the most visible forms of assembly – demonstrations, protest marches and labour strikes – and collected through a protest event analysis of local news reports. The coded variables range from information on the actual moment of the protest event, such as participants, issue, duration and location, to the aftermath, including variables related to legal consequences for protesters and the success of protesters’ claims, and many others. Besides a concise description of the research design and data collection process, this article discusses methodological strengths and weaknesses of the dataset.
The Motivation for Protest Data on Myanmar
After decades of military dictatorship, Myanmar has recently seen unprecedented socio-political change. Under the tenure of the semi-civilian government of President Thein Sein (2011–2016), thousands of political prisoners were pardoned, controversial mega projects were suspended and pre-publication media censorship was abolished. Moreover, reforms paved the way for the sweeping victory of Aung San Suu Kyi and her party, the National League for Democracy (NLD), in the 2015 general election. Considering the extent of the changes, many political scientists have described Myanmar as experiencing a political transition, even though various constitutional prerogatives have remained in place that safeguard the powerful position of the military (e.g., Egreteau 2012, 2016; Huang 2013; Jones 2014). While the general reform process has received much scholarly attention (e.g., Bünte 2011, 2016; Holliday 2013; Kyaw Yin Hlaing 2012), research on Myanmar's civil society has remained comparatively ‘quiet’ (notable exceptions are Lorch 2007; Petrie and South 2014; Prasse-Freeman 2012). To date, only a very small number of studies have looked at particular social movements and other civil society organisations that have occurred following the reforms (e.g., Chan 2017; Lidauer 2012; Simpson 2013). One could suppose that the gap in research, particularly beyond single case studies, is a result of the purge of the Burmese civil society during decades of military dictatorship. 1
Acknowledgments: I would like to thank Jella Fink, Richard Roewer, Laura Horning, and two anonymous reviewers for valuable feedback and comments on an earlier version of this article.
However, as early as 2008, when Cyclone Nargis hit the country, it became clear that the civil society had not been entirely “murdered”, as Steinberg (1997: 9) put it; instead, well-organised local groups appeared and carried out aid work that the military prevented international organisations from providing (Lorch 2007, 2008; South 2008). Hence, it is surprising that Myanmar's civil society has not received more scholarly attention. The present project seeks to narrow the research gap by contributing data to the study of the civil society in contemporary Myanmar.
Following Charles Petrie and Ashley South (2014: 86), civil society can be defined as “actors, voluntary associations and networks [that operate] in the space between the family/clan, the state in its various incarnations, and the for-profit market.” These actors in civil society engage, inter alia, in “creat[ing] channels […] for the articulation, aggregation, and representation of interests” (Diamond 2004: 8). One important tool to open these ‘channels’ is protest. Protest assemblies are aimed at “direct action on behalf of collective interests, in which claims [are] made against some other group, elites, or authorities” (Tarrow 1989: 359). Moreover, protest is frequently described as the means of the disenfranchised, simply because little more than people is needed to stage a protest, while the disruptive effect on ‘law and order’ can be tremendous. Therefore, protest is an important proxy to look at a civil society, and, when including the reactions to protest events, insights can be drawn that go beyond the civil society but encompass parts of the state, such as the security apparatus and the judiciary.
While the existing case studies on civil society organisations and social movements cannot be overvalued in terms of the insights they give into their action strategies, little is known about the “bigger picture.” For instance, is protest in one case different from protest in another? Have civil society organisations been given more leeway only in specific areas or across different topics, groups and locations? And if leeway has expanded more broadly, has it grown steadily over time and space or increased in distinct episodes? Answers to these and other questions require data beyond single protest movements.
Unsurprisingly, the lack of research on Myanmar's civil society is mirrored by the lack of protest data. Apart from a small dataset on protests in Rangoon in 1988 (Ferrara 2003), there has been no dataset specific to Myanmar. While protest data is important for researchers studying Myanmar, it is similarly relevant for scholars from different fields. Social scientists have compiled national protest datasets from various countries and times to answer a multitude of questions (see, e.g., Koopmans and Rucht 2002; Rucht and Ohlemacher 1992). However, most of these projects have focused either on single Western democracies or the “new democracies” that evolved after the collapse of the Soviet state (e.g., Beissinger 2002; Ekiert and Kubik 2001). Openly accessible protest datasets from countries in Asia are rare, particularly from authoritarian or transitional regimes. For instance, while various authors have collected rich protest data from South Korea's transition (e.g., Chang 2015; Chang and Vitale 2013; Kim 2009), only the dataset from a small project has been made public (Nam 2006a). The same applies to China (Steinhardt 2016) and other Asian nations. On the other hand, although cross-national datasets (e.g., Banks and Wilson 2017; Jenkins et al. 2012; Leetaru and Schrodt 2013; Nardulli, Althaus, and Hayes 2015; Raleigh et al. 2010) sometimes include Myanmar, their non-transparent sources or exclusive coding of international news outlets mean they are of limited use, particularly when analysing single countries at critical junctures, such as crisis or regime change, and when micro-foundations matter (Nam 2006b). Even though cross-national datasets differ in data quality, and promising new data has recently been presented (Weidmann and Espen forthcoming), such datasets, by definition, provide more breadth than depth.
Because of this lack of in-depth data regarding protest in Myanmar specifically, as well as Asia and cases of democratic transition generally, I decided to initiate the Myanmar Protest Event Dataset 2 in 2014. With the conclusion of the first step of the research agenda, the data has started to be analysed and triangulated (e.g., Buschmann 2018; Hossain et al. 2018) and has become openly available for download. Therefore, this article aims to introduce the Myanmar Protest Event Dataset (hereafter: MPED) to researchers interested in protest data from Myanmar, and also explain its methodology and research prospects.
Footnote 2: The first two data releases have been published with the GESIS Institute and are available for download (Buschmann 2016, 2017). Further releases can be found at: www.myanmarprotestdata.org.
The remainder of this article is structured as follows. First, the research design, including the methodology and the code scheme, will be presented. Second, the selection of the primary media source will be justified in light of Myanmar's media landscape in 2011. Lastly, the quality of the compiled data in terms of different biases will be discussed.
The Myanmar Protest Event Dataset
The data for the first version of the MPED was collected solely via a protest event analysis (PEA). A PEA is a specific type of quantitative content analysis that aims at the systematic collection of information on protest events by using news sources and, ultimately, transforming the information into machine-readable numbers (Krippendorff 2004: 18). PEA belongs to the standard methodological toolbox of social movement research, which is why I do not devote more space to its methodological description here and instead move directly to the dataset (for a review, see Hutter 2014).
Table 1: Some of the Principal Variables in the Dataset (v.1.1)
Note: Protests directly related to previously held protests.
In its first version (v.1.x), the MPED consists of data on N = 185 protest events that were collected from the English online newspaper articles of The Irrawaddy (as primary media source), published between 4 February 2011 (the election of President Thein Sein) and 31 December 2014 (see Table 1 for an overview of selected principal variables). The primary media source was thoroughly crosschecked with the English language versions of The Myanmar Times and The Global New Light of Myanmar. While The Myanmar Times is another popular and privately run newspaper, The Global New Light of Myanmar is government-owned and published by the Ministry of Information. Variables regarding the arrest and charges against protesters were additionally crosschecked with open data available with the Assistance Association for Political Prisoners Burma. Future updates of the dataset will include more primary sources, including Burmese publications.
Rather than being extended continuously, the MPED is updated in full versions, each covering a distinct episode. As such, the time-frame covered in the first episode was set to capture the first period of Myanmar's transition. It is surely debatable whether 4 February 2011 is the actual starting point of the transition because the “Roadmap to a Disciplined Democracy”, which laid out seven steps to a multiparty system, had already been revealed in 2003. Nevertheless, since the steps prior to the inauguration of the Thein Sein government were internationally condemned as rigged and ‘un-democratic’ (the steps included the drafting of the constitution, the referendum on the new constitution, and the general election in 2010), I argue that reforms that actually opened up and liberalised the polity were mainly made under Thein Sein.
The second version, scheduled to be finalised in the near future, will extend the dataset to cover the period from the run-up to the 2015 general election until the inauguration of the NLD government. Protests that took place from March 2016 onwards will follow.
How “Protest” is Defined
Most PEA studies define “protest” in broad terms and include actions that range from artistic performances or poetry to hunger strikes (Tarrow 1989). For the current project, however, only data from the most visible forms of protest (demonstrations, protest marches and labour strikes) was collected. The practical reason for this is that the accuracy of information on non-public or subtle forms of protest is extremely difficult to validate or crosscheck, particularly in an authoritarian context such as Myanmar. Hence, a narrower definition of protest was chosen. In order to capture all events that the authorities considered to be protest assemblies, the legal definition of “protest assembly” was followed. According to Myanmar's law, a protest assembly is “a gathering of more than one person, […] for the purpose of expressing their wishes and convictions” (Section 2(b), Pyidaungsu Hluttaw Law No. 7/2011). This definition intentionally excludes events that can be described as “mobs”, that is, assemblies of people who do not gather to express their “wishes and convictions” but only to commit collective physical violence (examples would include the Buddhist mobs targeting Muslim shop owners in Mandalay in 2012). Nevertheless, if protest assemblies turned violent in the course of a gathering, for whatever reason, they were still considered to be protests.
Researchers working with the dataset should be aware that the definition set by the legal code allows for a significant amount of interpretation on the part of the authorities. One implication is that other definitions of protest assembly from the literature, particularly those emphasising the number of participants and the purpose of the assembly, do not necessarily match the sample. In fact, some of the protest assemblies in the dataset might have been labelled as such by the authorities only in order to apply protest policing laws.
The Coding Scheme
To allow for comparisons with other datasets, the coding basis of variables was set by the code scheme of the research project “Prodat”, which was conducted by Dieter Rucht at the Wissenschaftszentrum Berlin. Similar in its approach, it collected protest event data from post-war Germany (Rucht 2010). Since Prodat targeted a variety of protest forms, some variables that did not make sense for the present study were dropped for the MPED.
Following Prodat, the variables were clustered according to their distance in time and space from the actual protest event, ranging from variables regarding the event itself (protest basis, mobilisation, participants, problem/issue/topic, and direct context) to those regarding the aftermath (mediate and long-dated context). The fourth column in Table 2 gives the percentage of empty data fields in each variable group (as in v.1.x). While it does not say anything about the quality of the data or of individual variables in that group, it shows quantitatively which information was least often included in the news reports (and, therefore, could not be coded). For researchers who want to utilise the dataset, it indicates which groups of variables might need to be enriched with information from other sources (assuming all variables in the specific group are of importance) and which variable groups have the most potential to be used for analysis.
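The share of empty fields per variable group can in principle be recomputed from the released CSV file. The following is a minimal sketch under stated assumptions: the column names, the group-to-variable mapping and the sample rows are hypothetical (the actual variable names are defined on the code sheet), and treating an empty string or “N/A” as an empty field is likewise an assumption.

```python
import csv
import io

# Hypothetical mapping of variable groups to column names; the real
# names are defined on the MPED code sheet.
GROUPS = {
    "participants": ["PART_NUM", "PART_TYPE"],
    "direct_context": ["POLICE", "ARRESTS"],
}

# Assumption: these strings mark a field that could not be coded.
EMPTY_MARKERS = {"", "N/A"}


def empty_share_by_group(rows, groups=GROUPS):
    """Return the percentage of empty fields for each variable group."""
    shares = {}
    for group, columns in groups.items():
        total = 0
        empty = 0
        for row in rows:
            for col in columns:
                total += 1
                value = (row.get(col) or "").strip()
                if value in EMPTY_MARKERS:
                    empty += 1
        shares[group] = 100.0 * empty / total if total else 0.0
    return shares


# Tiny illustrative sample, not actual MPED data.
SAMPLE_CSV = """PART_NUM,PART_TYPE,POLICE,ARRESTS
200,workers,yes,N/A
,students,N/A,N/A
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = empty_share_by_group(rows)
```

With the real file, `rows` would be read from the downloaded CSV instead of the inline sample, and `GROUPS` would be filled in from the code sheet.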
The Selection of the Primary Source
When choosing a news source for conducting a PEA, it is crucial to keep in mind the circumstances of the media landscape and their consequences for later interpretation. Despite today's vibrant media landscape, with several independent newspapers and broadcasters in Myanmar, the media is still not free. In August 2012, the former censorship agency, the Press Scrutiny and Registration Department, announced the abolition of pre-publication censorship (BBC 2012). However, critical stories related to sensitive topics, such as the military, the intelligence service, or Buddhist nationalism are seldom featured. Furthermore, formalised censorship has been replaced by self-censorship, and many papers barely meet international standards on media ethics and quality journalism. 3
Footnote 3: This information stems from anecdotal evidence drawn from the author's personal conversations with journalists, scholars and NGO employees from Myanmar on different occasions. The information was confirmed by a roundtable discussion organised by the Panther-Stiftung, which took place on 30 November 2014 in the taz-café in Berlin.
Table 2: Variable Groups and their Contents
Note:
Only those variables that are not conditional on another variable's attribute.
INI and RECIP (almost no information) have a high leverage here.
Reports seldom say that no intimidation was present, hence the value is skewed by the many “N/A.”
Some variables exist more than once (for instance, LOC1-3), see the description on the code sheet.
Daily printed newspapers, as used in most earlier PEA studies, were not usable for this study. Independent daily newspapers emerged in Myanmar only after the abolition of pre-publication censorship in August 2012; until then, censorship had made day-to-day publication impossible. 4 However, the period of time this study aims to observe started in February 2011. Moreover, given the underlying research design of this project, the range of news outlets that could be considered for the first version was further limited by language and format restrictions. Neither media outlets publishing solely in Burmese nor those released only in print could be considered. Overcoming the language difficulties would have been beyond the scope of the first stage of this project and is expected to follow only in later updates of the dataset. Nevertheless, the best-practice selection criteria set by previous research on PEA (see, e.g., Hutter 2014) can also be met by using an online news outlet.
Footnote 4: Other English-language online news outlets from Myanmar existed before 2012, but they were both very restricted in their work (The Irrawaddy was in exile at the time) and less popular than The Irrawaddy.
In February 2011, only a few outlets were already publishing continuously online and in English: the government-run newspaper The Global New Light of Myanmar, the private The Myanmar Times, and the three exile outlets Democratic Voice of Burma (DVB), Mizzima and The Irrawaddy. The Global New Light of Myanmar could not be considered since, in a trial sampling in 2011, not even major protest events were covered. The Myanmar Times was ruled out because of its unclear political standpoint and controversial ownership situation in 2011, which could have had a changing influence on selection and description biases (Mizzima News 2011). The Irrawaddy was preferred over Mizzima and DVB for its larger audience and the far larger number of articles it published throughout the years. Another well-known news outlet, Eleven Media, did not have an English website prior to 2012 and therefore could not be used as primary source. The English online version of The Irrawaddy was found to be the most suitable news source. It matched the selection criteria set out by Hutter (2014), which aim to maximise the sample size and minimise potential biases and thereby to maximise the representativeness and reliability of the data:
1. Continuous publication/2. Daily publication: The Irrawaddy was founded in Bangkok in 1993 as a continuous monthly magazine. The English news website was launched in 2000 and the Burmese version in 2001. In 2011, state censorship was lifted and The Irrawaddy website became accessible for internet users in Myanmar. From 2013, the monthly print journal could be distributed across the country; it was replaced by a weekly print journal in 2014 (The Irrawaddy 2016). By the time the liberalisation of media laws began in 2011, The Irrawaddy had thus been established for a long time. In this period, online articles were already published on a daily basis, which matches the continuity criterion.
3. High quality: The Irrawaddy is an internationally known and award-winning Burmese news outlet and can therefore be presumed to be more in accordance with international press standards and more credible than many other potential sources (see, for instance, CPJ 2015). Additionally, it is considered one of the most popular Burmese news sites (it currently ranks fourth in social media impact after 7Day News Journal, Eleven Media, and the [new] BBC Burmese, see Socialbakers 2016). However, since the NLD-government has been in office, the international reputation of The Irrawaddy has plummeted for its allegedly unbalanced pro-government coverage. In November 2017, an American outlet headlined: “Why is the U.S. Government Funding Anti-Rohingya Propaganda?”, criticising the financial support The Irrawaddy receives from US agencies (Carrol 2017). For future data versions, the media landscape will need to be reassessed.
4. Comparability with regard to political orientation: The Irrawaddy is, to a great extent, funded by Western donors (The Irrawaddy 2016). This implies close ties with the opposition party backed by Western governments, the National League for Democracy (NLD), led by Daw Aung San Suu Kyi. Thus, the political orientation is clearly on the side of the (then) opposition and can, hence, be controlled for.
5. Coverage of the entire national territory: The Irrawaddy claims to be a nationwide news outlet (The Irrawaddy 2016). But Myanmar lacks a unified nation-state in the periphery due to the existence of at least 135 ethnicities, the geographical situation and many ongoing conflicts (Callahan 2007; Smith 2007; Walton and Hayward 2014). This makes coverage of the entire national territory difficult and adds to the existing difficulties resulting from poor telecommunication systems. Nevertheless, the dataset also includes protest events in peripheral regions, which shows that information from relatively remote areas still makes it into the reports (see Figure 1).
The Coding Procedure
Firstly, the news articles were downloaded in February 2015 with the software SiteSucker for Mac OS X V.2.3.6, developed by Rick Cranisky. The software downloads websites by asynchronously copying the site's web-pages and files. In total, 152,145 files were downloaded.

Figure 1: Geographical Distribution of Protest Events
Secondly, all articles from the website sub-category “Burma” (Myanmar) were converted to PDF files to protect their content against any later changes. Thirdly, these PDF files were electronically searched for the following keywords: protest, gathering, demonstration, march, assembly, strike (each in singular and plural). In total, 3,848 relevant articles were found. Fourthly, the relevant articles were coded individually according to the code scheme and the data assembled in a CSV file. Many protest events were covered by more than one article, and many articles did not actually contain information on domestic protest events, which explains the difference between the 3,848 articles and only 185 events in the dataset (in the versions v.1.x). Whenever an article clearly stated that it reported on a previous protest that had already been coded, the existing entry in the dataset was updated with the most recent information. If a protest was linked to another protest but the two were staged separately, each protest was coded separately and the link between them was marked in the dataset (see code sheet, 5 variables SERIAL and FIRSTPEN).
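The keyword filter of the third step can be sketched as follows. This is only an illustration under assumptions: it operates on plain text rather than on PDF files, it matches whole words case-insensitively, and the function names are hypothetical. The article does not specify how the original search handled case or word boundaries.

```python
import re

# Keywords named in the text, each matched in singular and plural.
KEYWORDS = ["protest", "gathering", "demonstration", "march", "assembly", "strike"]


def _plural(word):
    """Build a simple English plural (assembly -> assemblies, march -> marches)."""
    if word.endswith("y"):
        return word[:-1] + "ies"
    if word.endswith("ch"):
        return word + "es"
    return word + "s"


# One case-insensitive pattern with word boundaries (an assumption,
# so that e.g. "protesters" alone does not count as a match).
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(f) for w in KEYWORDS for f in (w, _plural(w))) + r")\b",
    re.IGNORECASE,
)


def is_candidate(text):
    """Return True if the text contains at least one keyword form."""
    return PATTERN.search(text) is not None
```

A filter of this kind is deliberately broad. A whole-word match on “march”, for instance, would also match the month name, which is one plausible reason why many keyword hits turn out not to describe a domestic protest event and why each article still has to be coded by hand.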
Footnote 5: The code sheet can be found online, see footnote 2.
Potential Biases
From February 2011 until December 2014, The Irrawaddy covered 185 protest assemblies. Although this is not a small number, it is neither a complete record of all protests nor a fully representative sample, and various selection biases may be present. In order to reach valid conclusions, it is important to know what predicts whether an event is covered in the source or not and, if biases exist, whether they are stable over the observed period of time. Although a discussion of biases is highly dependent on the research question and the intended use of the protest data, a general discussion is presented below.
Studies have argued that news outlets report selectively (“selection bias”) about events or report erroneously about information on events (“description bias”) (Earl et al. 2004; McCarthy, McPhail, and Smith 1996). Incorrect coding is another potential source of bias (“researcher bias”) that must be taken into account. All three sources of biases will be discussed in the following.
Selection Bias
An exile media origin suggests generally reduced selectivity, because protest events are more newsworthy for exile media and depicted more accurately by them; this is examined below. Earl et al. (2004) pointed to three factors that generally affect selection bias: (1) event characteristics, (2) news agency characteristics, and (3) issue characteristics.
Event characteristics: As a rule of thumb, the bigger and more violent the event, the more likely it is to be covered. In the uncertainty of Myanmar's political transition (in early 2011), protests can be assumed to have been genuinely newsworthy even without violence and with only a small number of participants. This might have changed with increasing liberalisation and needs to be controlled for in an analysis.
News agencies: If news wires are present where an event takes place, the likelihood of coverage is higher. Since the first international news wire, the Associated Press (AP), opened a bureau in Myanmar only in 2013, a selection bias caused by the news agency supply of protest event news is unlikely before that year, simply because no news agency was present. From 2013 to January 2016, The Irrawaddy published only 589 articles from AP across all theme categories. AP thus contributed only a small number of articles to The Irrawaddy, and only during roughly the last one and a half years covered by the first version of the MPED. However, later versions of the dataset need to reassess the situation.
Issue characteristics: The more public interest an issue represents, the more likely it is that an event related to it will be covered. Once information about a protest reached The Irrawaddy in its exile (until 2012), it was surely newsworthy, regardless of the topic. Reporting on a protest as completely as possible could also be more favourable for exile media, for two reasons. First, even ‘hard facts’ such as the duration of a protest say something about the repression in the home country, regardless of missing ‘soft facts’. Second, the costs of staging protest events in repressive environments are high, which makes every protest noteworthy. This generally speaks for a less biased selection. One could nonetheless expect protests that are not in favour of the Burmese majority state to be reported less often. In practice, however, the dataset even includes protests against the Tatmadaw's operations against ethnic minorities, such as the Kachin.
Description Bias
Exile media, like opposition media in general, may have a stake in intentionally leaving information out or misrepresenting information with the aim of discrediting the current regime. It is improbable that a protest event itself was faked, as this would damage a news outlet's credibility sooner or later. Nevertheless, single variables may be over- or underestimated, which is a source of description bias, that is, incorrect information about a covered protest event. Schweingruber and McPhail (1999) differentiated between hard news about the “who, what, when, where, and why of the event”, which is “in general, accurate, indicating that missing data may be the most serious form of description bias”, and soft news, which is “subject to multiple sources of bias” (Earl et al. 2004: 73). Soft news is more detailed information about, for example, the exact claim or number of people involved. Such soft news may be skewed by the political standpoint of the source, which is conceivable in the case of The Irrawaddy. To improve the reliability of the data, especially of the soft news, thorough crosschecks were conducted with The Myanmar Times and The Global New Light of Myanmar (after 2012). Variables regarding the arrest of and charges against protesters were crosschecked with open data available from the Assistance Association for Political Prisoners Burma; no inconsistencies were found.
Researcher Bias
Researcher bias refers to errors in the selection and coding of data. Because the author of this study was the only coder of the first version, this source of bias cannot be ruled out, but it can at least be expected to have remained constant across the dataset.
It was argued above that the chosen primary media source copes well with selection bias and that the most serious distortions would predominantly affect soft variables. Nevertheless, a description bias due to the exile origin of The Irrawaddy is likely, particularly in 2011. This suggests that one ought to expect bias of differing severity among the variable groups (see Table 2). Specifically, basic protest data (hard news) might be less biased than, for instance, information on the spatial mobilisation of protesters (soft news).
Conclusion
This article introduced the Myanmar Protest Event Dataset as a new resource for researchers working on Myanmar, Southeast Asia, and/or political transitions, who are looking for a large sample of protest assemblies from 2011 onwards. One of the strengths of the dataset is its extensive set of variables, which captures even information that is distant in time and space from the protest events themselves (such as court cases and policy outcomes of protests). The examination of weaknesses pointed to a potentially high “description bias” due to the pro-NLD standpoint of the primary source, which should be kept in mind when working with the dataset. However, the severity of description bias differs among variables, which underlines the usefulness of the variable clusters introduced here: they allow individual groups of variables to be controlled for.
