
Abstract

The fallacy of premature designations such as “Iran's Twitter Revolution” can be attributed to the empirical gap in our knowledge about such sociotechnical phenomena in non-Western societies. To fill this gap, we need in-depth analyses of social media use in those contexts and to create detailed maps of online public environments in such societies. This paper aims to present such a cartography of the political landscape of Persian Twitter by studying the case of Iran's 2013 presidential election. The objective of this study is twofold: first, to fill the empirical gap in our knowledge about Twitter use in Iran, and second, to develop computational methods for studying Persian Twitter (e.g., effective methods for analyzing Persian text) and identify the best methods for addressing different issues (e.g., topic detection and sentiment analysis). During Iran's 2013 presidential election, three million tweets were collected and analyzed using social network analysis and machine learning. The findings provide a more nuanced view of the political landscape of Persian Twitter and identify patterns in accordance with or in contrast to those identified in the English-speaking Twittersphere around the 2013 presidential election. Persian Twitter was dominated by micro-celebrities, whereas institutional elites dominated English discourse about Iran on Twitter. The results also illustrate that Persian Twitter in 2013 was predominantly in favor of reformists. Finally, this study demonstrates that sentiment analysis toward political named entities can be used efficiently for mapping the political landscape of conversation on Twitter.

Keywords

Twitter, Iran, social network analysis, political landscape, computational methods, Big Data

This article is a part of special theme on Social Media & Society. To see a full list of all articles in this special theme, please click here: https://journals.sagepub.com/page/bds/collections/social-media-society.

Introduction

On 15 June 2009, millions of Iranian citizens marched through the streets of Tehran to protest the outcome of the disputed presidential election in which Mir-Hossein Mousavi, a reformist candidate, was running against the incumbent conservative candidate Mahmoud Ahmadinejad. The next day, the Washington Times wrote an editorial titled “Iran's Twitter Revolution.”¹ The fallacy of this description has been thoroughly discussed in subsequent scholarship (Howard, 2010; Morozov, 2011). However, during the following uprisings in Iran's Green Movement, and later during the Arab Spring, many political elites and scholars overemphasized the role of social media. While many young Iranians used social media—mainly Facebook—in 2009 (Howard and Hussain, 2012; Khazraee and Losey, 2016), Twitter did not play any role in mobilizing people in Iran (Christensen, 2011; Morozov, 2009). One reason for misunderstandings such as this is an empirical gap in our knowledge about such sociotechnical phenomena in non-Western societies. To fill this gap, we need in-depth analyses of social media use in those contexts.

Social media scholarship is an active area of research, but most of this research has focused on Western societies. There are fewer studies that address the international uses of social media. While there is a growing body of literature about social media use in Iran (Faris and Rahimi, 2015; Khazraee and Losey, 2016; Wojcieszak and Smith, 2014; Wojcieszak et al., 2012), there are still major gaps in our understanding of the political landscape of Persian Twitter. As with any unknown area, to better understand what exists in a domain and how things are related to each other, we need to map its landscape. In the absence of empirical maps, there is always a danger of projecting our impressions, as happened in the case of the so-called Twitter Revolution. However, to create empirical maps, we need methodically and rigorously collected evidence. Data collection through public opinion surveys and other traditional social science research methods is highly restricted in countries such as Iran. Fortunately, recent technological developments and the widespread use of social media enable us to use digital trace data (Jungherr, 2015) and Big Data analytics to create detailed maps of social interactions in online public environments, which can better represent those communities.

This paper aims to present such a cartography of the political landscape of Persian Twitter by studying the case of Iran's 2013 presidential election. Its objective is twofold. First, as mentioned above, it aims to fill the empirical gap in our knowledge about Twitter use in Iran. Second, it aims to deploy computational methods for studying Persian Twitter and identify effective ones for addressing different issues (e.g., topic detection and sentiment analysis). The purpose of this study is to illustrate an approach to fill both the empirical and the methodological gaps in studying social media use in Iran using big social data and computational methods. To that end, this paper provides a more nuanced view of the political landscape of Persian Twitter discourse and identifies patterns in accordance with or in contrast to those identified in the English-speaking Twittersphere around the 2013 presidential election. It also discusses the mechanisms of information diffusion in that landscape.

Literature review

Social media as networked publics in Iran

boyd (2011) defines networked publics as “publics that are restructured by networked technologies” (p.39). She argues that as we move our social lives more and more to networked publics, it is critical to understand the affordances, dynamics, and implications of networked publics for social life. Papacharissi (2011) discusses the mutual transformation of individuals, identities, and citizenship through an investigation of the relationship between digital media and democracy. She argues that digital citizenship is civic responsibility enabled by digital technologies. Networked publics serve many of the functions of traditional publics, including civic engagement and connecting people to others beyond their familial or physical circles; however, they also introduce new affordances such as replicability, scalability, and searchability (boyd, 2011). Public space can be defined as a physical or social phenomenon (Lefebvre, 1991). Following the notion of the social construction of space, under repressive cultures, social media create opportunities to form social spaces for deliberation, functioning as short-term public spheres that enable discussions of sensitive topics and the presentation of dissenting views (Rauchfleisch and Schäfer, 2015). Social media have also emerged as part of repertoires for contentious politics (Khazraee and Losey, 2016). In Iran, social media created such public spaces in different forms, mostly to fill the absence of free public spaces (Faris and Rahimi, 2015; Khazraee and Losey, 2016).

Iranians were among the early adopters of social media. In the 2000s, the Iranian blogosphere developed as a national and international platform for vibrant discussions around diverse topics, including politics. “Blogestan,” as it is often called, emerged as a leading platform for user-generated content. Blogestan also offered a central platform for contentious issues such as politics (Faris and Rahimi, 2015; Kelly and Etling, 2008; Khazraee and Losey, 2016). Kelly and Etling (2008) conducted one of the earliest studies investigating the landscape of social media use in Iran by mapping Blogestan. Employing social network analysis (SNA) and automated content analysis, they created a high-resolution image of thousands of Persian blogs. The outcome of that study revealed the gap in prevalent beliefs about Blogestan and demonstrated how empirical research using computational methods can change our understanding of online publics. Prior to Kelly and Etling's work, it was widely assumed that Blogestan was dominated by pro-reformist bloggers. However, their study illustrated that Blogestan was relatively evenly divided between reformist and conservative bloggers. While there is a growing body of work on social media in Iran, there is still a pressing need for further studies along the lines of the work done by Kelly and Etling, using big social data to investigate the landscape of online publics in Iran, including different social media outlets.

Twitter as a medium for political communication

With its 328 million monthly active users communicating in over 40 different languages (Reuters, 2017), Twitter has become a popular medium not only for personal information sharing and interpersonal communication but also for more structured collective activities during social and political events. Twitter allows users and their communications to be visible and identifiable by others; it allows users to connect by using common hashtags, mentioning one another, and replying to one another; and communication flow can be internally coordinated by the members of a self-established communication network (Wang and Chu, 2017).

Twitter's role in political communication in different parts of the world, especially around elections and campaigning, has increasingly been studied in recent years (Weller et al., 2014). Such studies have generally examined the use of Twitter as a communication channel for social protests (Penney and Dadas, 2014), political election campaigns (Enli, 2017; Heo et al., 2016; Jungherr, 2016; Jungherr et al., 2015; Kim et al., 2016; McGregor, 2018; Ott, 2017; Park et al., 2016), and revolutionary movements during political upheavals (Lotan et al., 2011). In these contexts, Twitter has been used either by politicians (parties and candidates) as a platform for self-representation, or by citizens or “Twitter publics” (Jungherr, 2016), as a communication tool for arranging social and political movements (Anstead and Chadwick, 2018; Ceron, 2017; Engesser et al., 2017).

Some scholars have focused on the use of Twitter around party politics. Dubois and Gaffney (2014) identified the most influential political players within the two largest Canadian political communities on Twitter through a comparative study of different SNA measures of indegree, eigenvector centrality, clustering coefficient, knowledge, and interaction. They revealed that political elites such as media outlets, journalists, and politicians were most influential; political commentators and bloggers were somewhat less influential; and opinion leaders were still less influential, mainly influencing those in their immediate personal networks. Another study focused on two cases: the Dutch parliamentary Twittersphere, which reveals “functional interactions between professional elites,” and the German Twittersphere, which reveals “two loosely connected networks with quite different core interests: net politics and fun” (Paßmann et al., 2014: 333). The authors argue that users' behaviors are dependent on their particular socio-cultural contexts and discuss how such behaviors contribute to the way retweeting, favoring messages, and circulation of Twitter messages occur in general.

Other scholars have focused on how ordinary users engage with contentious politics on Twitter. These studies paid attention to news creation and communication flows, and key players controlling it. Papacharissi and Oliveira (2012) studied #egypt communications on Twitter during the 2011 Egyptian uprisings leading to the resignation of Egyptian President Mubarak. In this large-scale study, the authors explored four prominent news values specific to Twitter that made this microblogging site a platform for turning events into news stories in a context where access to mainstream news was restricted or controlled by the government. The news values were instantaneity (retweets and requests for instant updates), crowdsourced elites (participation of key bloggers, activists, and informed citizens), solidarity (communication of words of solidarity), and ambience (creation of an ambient information-sharing environment). Morales et al. (2012) examined the 2010 #SOSInternetVE Twitter protest, which was organized to support freedom of information on the internet in Venezuela, through complex network analysis. They identified retweets as the main mechanism for information propagation in such environments. Lotan et al. (2011) focused on the Twitter communication flow during the 2011 Tunisian and Egyptian uprisings and identified the key actors of the network to be mainstream media organizations, individual journalists, influential regional and global actors, and other ordinary users who actively tweeted posts related to these two revolutions. González-Bailón et al. (2011) studied Twitter political communications related to the 15-M movement in Spain in 2011 and identified two groups of key users: early participants, who were the activists leading the protests, and those who were influenced by this group and joined at a later time but eventually became the more central actors in the network in disseminating information. 
They concluded that users who acted as seeds of message cascades (i.e., the spreaders of information) held more central positions in the network.

Heo et al. (2016) investigated the role of social media including Twitter in relation to traditional media during Seoul's 2011 mayoral election. They concluded that social media users actively participate in the production and restructuring of political discourse beyond what is presented by traditional media. The same authors also studied Twitter issue networks formed around the 2012 presidential debates in South Korea (Park et al., 2016). Their findings show that political discussion around elections forms homogeneous communities centered on political ideologies and geographic locations. They also demonstrate that the landscape of Korean Twitter during televised debates favored the opposition candidate. In a similar study, Kim et al. (2016) discuss the role of Twitter in a hybrid media system in the context of elections. They discuss how Twitter was used for direct interaction among users of social media and the listeners of a popular political podcast. Such studies highlight the need for further research focusing on non-Western contexts.

Participation in online publics through Twitter can also contribute to the formation of communities of like-minded users that can be delineated across ideological, political, or other personal attributes (Bäck et al., 2018; Bruns and Burgess, 2011; Gruzd et al., 2011; Stephansen and Couldry, 2014). Bäck et al. (2018), studying xenophobic online forums, argue that participation in such online communities can contribute to a shift from an individual identification to a group one over time. Gruzd et al. (2011) discuss how social media make multiple community memberships possible and help people to form new connections while maintaining their existing ones. They argue that Twitter can form the basis of interlinked personal communities. Bruns and Burgess (2011) argue that political discussions on Twitter lead to the formation of different communities, sometimes because of preplanning and sometimes as an ad hoc result of discussion and deliberation. Discussing community formation on Twitter, Park et al. (2016) argue that communities on Twitter reflect ideological or geographical communities. Therefore, studying community formation on Persian Twitter can also be key to understanding how users interact in that space and how that relates to different political orientations.

Twitter in Iran

Twitter received considerable media attention in connection with the 2009 protests, despite a low user base and low use within Iran (Christensen, 2011; Wojcieszak and Smith, 2014). As Christensen writes, “[t]he protests were labeled by some as a Twitter Revolution, despite the fact that there were just over 19,000 Twitter users in Iran out of a total population of just under 80 million” (2011: 238). However, the vast majority of Twitter's users tweeting about the protests were not based in Iran (Howard, 2010). Twitter was blocked preceding the 2009 elections in Iran, and, aside from a brief reprieve in 2013, has remained blocked since. Later, a study of young Iranians' use of social media confirmed that in 2009, Twitter played a minimal role in the political engagement of that demographic, which was the most technologically savvy population in Iran (Wojcieszak and Smith, 2014). However, Twitter did play an important role as an international platform for disseminating news about politics in Iran through popular hashtags such as #IranElection (Howard, 2010; Mottahedeh, 2015).

While some scholars have examined the use of Twitter in Iran (Christensen, 2011; Faris and Rahimi, 2015; Howard, 2010; Wojcieszak and Smith, 2014), there have been few studies using computational methods investigating the digital trace of users' interactions on Twitter. Shortly after the disputed 2009 election in Iran, the Web Ecology Project (2009) published a report of users' interactions on Twitter over the course of 18 days around the election (7–26 June 2009). The report investigated approximately two million tweets and provided a statistical overview of the population and activities during that period, and identified influential users, taking into account both numbers of original tweets and numbers of followers. However, the report did not go beyond a descriptive analysis of the data set. In another study, Khonsari et al. (2010) focused on the 2009 Iranian presidential elections, and through an analysis of tweets posted by opposition groups, revealed two main connected clusters in the network: people who opposed the government (70% of the nodes) and people who supported it (24% of the nodes). However, the authors did not provide details on how they identified political affiliations and whether the users they studied were Persian-speaking users only. Another study of Iran's 2009 election (Zhou et al., 2010) investigated the dynamics of information propagation during that period by analyzing three million tweets related to the Iranian election posted by approximately 500 thousand users during June and July of 2009. The authors initiated their data collection by studying the hundred most influential users provided by the Web Ecology Project (2009) and investigated a follower–followee network of 20 million users to identify tweets related to the Iran election. The paper investigated the structure and mechanisms of information propagation on Twitter and illustrated that the information cascades are shallow and that the depth of 99% of the cascades is less than 3. 
The authors also claimed that trending and the search bar play important roles in defining the retweet behavior of users beyond the follower–followee networks. While these studies provide some information about political engagement on Persian Twitter, there are still gaps in our understanding of how the political landscape is divided among different political orientations and the structure of information diffusion processes in this landscape. The present study aims, first, to improve our understanding of the political landscape of Persian Twitter. It is also an attempt to fill the existing gap in the use of computational methods for studying the Iranian Twittersphere. We aim to use big social data collected from user interactions on Twitter to present a more accurate mapping of the political landscape of Persian Twitter (similar to Kelly and Etling's study (2008) on Blogestan). The 2013 presidential election is a particularly significant context for studying Twitter use in Iran, since it followed a strict ban on Twitter use after the 2009 election.

Data collection and methods

Data collection

We used the Personal Zombie application (Black et al., 2012), which uses the REST API search feature to collect statuses containing hashtags and keywords occurring within the last seven days. Personal Zombie can collect all tweets in the search index going backward in time to the point of the application's last run by matching the last-retrieved tweet ID. This strategy worked well with the size of retrieved data and did not face the challenges posed by Twitter's rate limit. It also enabled us to retro-collect data when we noticed a new trend or hashtag.
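The backward collection strategy described above can be sketched as follows. This is a minimal illustration and not the Personal Zombie implementation: `fetch_page`, `make_fake_search`, and the `max_id` cursor are hypothetical names, and the in-memory index stands in for the search endpoint.

```python
from typing import Any, Callable, Dict, List, Optional

Tweet = Dict[str, Any]


def collect_since(fetch_page: Callable[[Optional[int]], List[Tweet]],
                  last_seen_id: int) -> List[Tweet]:
    """Walk backward through search results until the last-retrieved ID is reached."""
    collected: List[Tweet] = []
    max_id: Optional[int] = None  # None means "start from the newest tweet"
    while True:
        page = fetch_page(max_id)  # assumed to return newest-first tweets with id <= max_id
        if not page:
            break  # the search index is exhausted
        for tweet in page:
            if tweet["id"] <= last_seen_id:
                return collected  # reached the point of the previous run
            collected.append(tweet)
        max_id = min(t["id"] for t in page) - 1  # continue just below this page
    return collected


def make_fake_search(index: List[Tweet], page_size: int = 3):
    """Hypothetical stand-in for a paged search endpoint (for illustration only)."""
    ordered = sorted(index, key=lambda t: t["id"], reverse=True)

    def fetch_page(max_id: Optional[int]) -> List[Tweet]:
        eligible = [t for t in ordered if max_id is None or t["id"] <= max_id]
        return eligible[:page_size]

    return fetch_page


fake_index = [{"id": i, "text": f"tweet {i}"} for i in range(1, 11)]
new_tweets = collect_since(make_fake_search(fake_index), last_seen_id=4)
new_ids = [t["id"] for t in new_tweets]
```

Because each run stops at the previously stored ID, repeated runs do not fetch the same tweet twice for a given keyword.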

Many Iranian users use English hashtags for their Persian tweets (e.g., the popular #IranElection); therefore, we included both Persian and English keywords in our effort to collect a more inclusive data set. We began data collection with a set of initial keywords, including the names of presidential candidates in Persian and English (with multiple possible spellings in English). Every day during the data collection period, we monitored the data set to identify emerging keywords and hashtags and added them to the data collection, using the affordances of the REST API to backtrack them. At the end of the data collection, we had a total of 47 keywords and hashtags (see Appendix 1).

Because our goal was to map the political landscape of Twitter around Iran's presidential election, we decided to collect data both before and after the election, in order to capture interactions of users in both periods. We collected data between 13 May 2013 and 29 June 2013 (four weeks before and two weeks after the election). At the end of our data collection period, we merged all collected data sets for each keyword and dismissed duplicate tweets that were collected multiple times for different keywords. The data collection yielded 3,006,528 tweets about Iran and the election, including 460,008 tweets in Persian.
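The merging step can be illustrated with a short sketch. It assumes that each collected tweet carries a unique `id` field and that the per-keyword data sets are held in a dictionary; both the field name and the data layout are assumptions made for the example.

```python
from typing import Any, Dict, List

Tweet = Dict[str, Any]


def merge_keyword_sets(datasets: Dict[str, List[Tweet]]) -> List[Tweet]:
    """Merge per-keyword collections, keeping one copy of each tweet ID."""
    merged: Dict[int, Tweet] = {}
    for tweets in datasets.values():
        for tweet in tweets:
            # setdefault keeps the first copy and ignores later duplicates
            merged.setdefault(tweet["id"], tweet)
    return list(merged.values())


# Toy data: tweet 2 matches both keywords and was therefore collected twice.
collected = {
    "#IranElection": [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}],
    "Rouhani": [{"id": 2, "text": "b"}, {"id": 3, "text": "c"}],
}
unique_tweets = merge_keyword_sets(collected)
```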

Data analysis

After data preparation, which included removing duplicates and normalizing Persian text,² we processed and analyzed our data sets in two tracks. In the first track, we used SNA to understand the structure of communication networks on Twitter during the presidential election as well as to analyze the mechanisms of information diffusion (Figure 1). In the second track, we applied Natural Language Processing (NLP) for textual analysis to better understand the nature of discourse around the presidential election and the political landscape of the Persian Twittersphere (Figure 2).

Figure 1.

First track of data analysis focused on social network analysis.

Figure 2.

Second track of analysis focused on textual analysis.

We focused our SNA on the retweet network because retweets can be used to trace information propagation among users (Arnaboldi et al., 2014; boyd et al., 2010; Pezzoni et al., 2013). We analyzed the retweet networks of Persian and English tweets in two separate data sets to identify differences between the information diffusion practices and the structures of these two networks.

To identify the most visible and the most influential users in retweet networks, we used indegree and eigenvector centrality measures, respectively. Indegree centrality identifies users who received the highest number of retweets (most visible). Generally, degree centralities assume that nodes with more connections are more important. However, in the real world, having connections with more important nodes can be more important than having many connections. Eigenvector centrality is a measure that tries to incorporate the importance of neighbors and their connectedness (most influential). Therefore, eigenvector centrality can be used as a measure of influence (Zafarani et al., 2014). To understand the power structure in the information diffusion networks and the difference between Persian and English Twitter conversations about Iran, we identified the 100 most visible (indegree centrality) and most influential (eigenvector centrality) users in both networks and classified them into four main categories (Table 1). One of these categories, social media celebrities, is defined on the basis of Senft's (2008) concept of micro-celebrity.
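The two centrality measures can be illustrated on a toy retweet network. The sketch below is not the tooling used in this study: it assumes that each edge points from the retweeting user to the retweeted author, and it computes eigenvector centrality by a simple power iteration in which each node keeps its own score (a common shift that aids convergence without changing the leading eigenvector).

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (retweeter, retweeted author); assumed direction


def indegree(edges: List[Edge]) -> Counter:
    """Number of retweets each author received (visibility)."""
    return Counter(author for _, author in edges)


def eigenvector_centrality(edges: List[Edge], max_iter: int = 200,
                           tol: float = 1e-10) -> Dict[str, float]:
    """Power iteration: a user's score grows with the scores of those who retweet them."""
    nodes = sorted({n for edge in edges for n in edge})
    incoming = defaultdict(list)
    for retweeter, author in edges:
        incoming[author].append(retweeter)  # repeated retweets add weight
    score = {n: 1.0 for n in nodes}
    for _ in range(max_iter):
        # Each node keeps its own score and adds the scores of its retweeters.
        new = {n: score[n] + sum(score[u] for u in incoming[n]) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if sum(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


# Toy network: "b" is retweeted by three users and retweets each of them once.
toy_edges = [("a", "b"), ("c", "b"), ("d", "b"),
             ("b", "a"), ("b", "c"), ("b", "d")]
visibility = indegree(toy_edges)
influence = eigenvector_centrality(toy_edges)
```

In this symmetric example both measures rank the same user first. The measures diverge when a user receives few retweets but receives them from highly retweeted accounts.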

Table 1.

Categories used to code the most influential users.

Category	Description	Example
News/media outlets	Official news agencies	@cnnbrk (breaking news from CNN)
Journalists	Individuals who work as journalists for official news agencies or media outlets or as freelance journalists	@Gesfandiari (Radio Free Europe)
Politicians	Official Twitter accounts of politicians	@Rouhani_ir
Social media (Twitter) celebrities	Popular users who tweet from inside or outside Iran and famous Iranian bloggers	@Vahid (known as Vahid Online)

We sought to determine which of these categories was the most influential during the election by investigating which users held central positions in the retweet network. In order to understand how communities form around the diffusion of information on Twitter, we also conducted a community detection analysis on the retweet network. To this end, we used the “fast unfolding of communities in large networks” algorithm, also known as the Louvain method, for community detection (Blondel et al., 2008). This algorithm has been identified as one of the most efficient methods for this purpose (Emmons et al., 2016; Yang et al., 2016). We then collected information about the most influential community members from their Twitter profiles and tweets to understand the composition of major communities and their political orientations. To better understand how different communities are formed in relation to each other in the Persian retweet network, we used the force-based algorithm ForceAtlas2 (Jacomy et al., 2014) to visualize the result of community detection analysis (Figure 5).
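The Louvain method searches for a partition of the network that maximizes modularity, a score comparing the share of edges that fall inside communities with the share expected if edges were placed at random. The sketch below only evaluates that score for a given partition and does not implement the Louvain search itself. It also treats the network as undirected and unweighted, which is a simplifying assumption made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def modularity(edges: List[Tuple[str, str]], community_of: Dict[str, str]) -> float:
    """Modularity Q = sum over communities of (L_c / m) - (d_c / (2m))^2.

    L_c is the number of edges inside community c, d_c is the sum of the
    degrees of its nodes, and m is the total number of edges.
    """
    m = len(edges)
    intra = defaultdict(int)       # L_c
    degree_sum = defaultdict(int)  # d_c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy network: two triangles joined by a single bridging edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("x", "y"), ("y", "z"), ("x", "z"),
             ("c", "x")]
two_groups = {"a": "left", "b": "left", "c": "left",
              "x": "right", "y": "right", "z": "right"}
one_group = {n: "all" for n in two_groups}

q_split = modularity(toy_edges, two_groups)
q_merged = modularity(toy_edges, one_group)
```

For this toy network, splitting along the bridge gives Q = 5/14, whereas placing every node in one community gives Q = 0, so the split is the preferred partition.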

Figure 5.

Force-based visualization of retweet network reveals the isolated MEK community.

In the second track of analysis (Figure 2), we used machine learning and NLP methods for language detection, sentiment analysis, topic detection, and sarcasm detection. The data processing workflow starts with a language identifier to detect the language of each tweet. Language detection for short texts can be problematic and inaccurate (Bergsma et al., 2012). To address this problem, we trained the Language-Aware String Extractor package³ (Brown, 2012) to recognize Arabic, English, French, German, Persian, Spanish, Turkish, and Urdu. We achieved 98.10% accuracy on our test data set. Next, we deployed supervised learning to train a classifier to identify subjects of Persian tweets. Finally, we detected the presence or absence of candidate(s) in those tweets. If the subject of a tweet was politics and there was at least one candidate's name in the tweet, we used the classifier to detect sarcasm, sentiment, and political orientation.
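The gating logic of this workflow can be sketched as a small pipeline. The function below is a hypothetical illustration: the classifier arguments are placeholders for the trained models, the language code "fa" for Persian follows ISO 639-1, and only the sentiment step is shown, although sarcasm and political orientation were gated in the same way.

```python
from typing import Any, Callable, Dict, List


def analyze_tweet(text: str,
                  detect_language: Callable[[str], str],
                  is_political: Callable[[str], bool],
                  candidate_names: List[str],
                  classify_sentiment: Callable[[str], str]) -> Dict[str, Any]:
    """Apply each step only when the previous one makes it meaningful."""
    result: Dict[str, Any] = {"language": detect_language(text),
                              "political": None,
                              "candidates": [],
                              "sentiment": None}
    if result["language"] != "fa":  # only Persian tweets go to textual analysis
        return result
    result["political"] = is_political(text)
    result["candidates"] = [name for name in candidate_names if name in text]
    # Sentiment is predicted only for political tweets that name a candidate.
    if result["political"] and result["candidates"]:
        result["sentiment"] = classify_sentiment(text)
    return result


# Toy stand-ins for the trained models (assumptions for illustration only).
def toy_language(text: str) -> str:
    return "fa" if text.startswith("[fa]") else "en"


def toy_subject(text: str) -> bool:
    return "election" in text


def toy_sentiment(text: str) -> str:
    return "for" if "support" in text else "neutral"


names = ["Rouhani"]
gated = analyze_tweet("[fa] I support Rouhani in the election",
                      toy_language, toy_subject, names, toy_sentiment)
skipped = analyze_tweet("[fa] lovely weather today",
                        toy_language, toy_subject, names, toy_sentiment)
foreign = analyze_tweet("Rouhani wins the election",
                        toy_language, toy_subject, names, toy_sentiment)
```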

We used the Stanford Classifier⁴ (Manning and Klein, 2003) for the rest of the supervised learning process of textual analysis of Persian tweets, including sentiment analysis, topic detection, and sarcasm detection. In order to conduct the supervised learning tasks, we created a training data set by annotating a randomly selected set of tweets. We recruited three bilingual annotators fluent in Persian and English and assigned the same data set of 300 randomly selected tweets to all three annotators for initial annotations. Then, the three annotators discussed their work to achieve agreement on the process of annotation. Next, we assigned 1200 tweets to each annotator (200 shared between all three). The annotation resulted in 3500 annotated tweets, of which 500 were annotated by all three annotators. For each tweet in the data set, annotators identified the following items, shown in Table 2.

Table 2.

Summary of data annotation tasks for training machine learning classifier.

Annotation category	Task
Subject	Is the tweet related to politics?
Political figures	Does the tweet text mention any of the eight candidates?
Sentiment	What is the sentiment toward the candidate (for, against, or neutral)?
Sarcasm	Is there a presence or absence of sarcasm?
Political orientation	Is the tone conservative, opposition, reformist, or unknown?^a

^a The political affiliations in Iranian politics can be confusing. There are two major political coalitions inside Iran, known as the Reformists and the Conservatives or Principlists. The dissident groups residing outside Iran are broadly known as the Opposition in Persian political nomenclature. We applied the same categories for classification of political orientation. The Reformist movement in Iran formed after the election of Mohammad Khatami as president in 1997. Reformists in Iran are mostly supported by younger and urban demographics. The Reformist movement supports a more open sociocultural atmosphere in Iran while reducing tension in international relations. Conservatives support the status quo and the strict rule of religion, and usually appeal to more religious demographics and populations with lower socioeconomic status.

Sentiment analysis and opinion mining have been subject to vast research endeavors (Martínez-Cámara et al., 2014; Tripathi and Naganna, 2014). Two main approaches have been used for sentiment analysis on Twitter: unsupervised (or lexicon-based) methods and supervised methods (using a manually coded training data set). In this study, we did not use a lexicon-based method, for two main reasons. First, we did not have access to a reliable dictionary for term sentiments in Persian. Second, and more important, the use of dictionaries for sentiment analysis has been criticized for missing nuances (Grimmer and Stewart, 2013). Furthermore, the presence of sarcasm and the limited size of the text in each tweet raised concerns that the lexicon-based approach would not yield adequate results for sentiment analysis on Twitter data. In order to conduct sentiment analysis, we used N-gram features (unigram and bigram) for the training of the Stanford Classifier on our manually coded training data. The training resulted in 84.20% overall accuracy of sentiment detection of Persian tweets according to our 10-fold cross validation. We then applied this classifier to detect the sentiment toward various candidates. To do this, we first identified tweets that included a candidate's name, and then we predicted the sentiment toward that named entity. In this way, we were able to gauge the overall sentiment toward a candidate and their political party. We used sentiment toward candidates as a means to map the political landscape of the Persian Twittersphere.
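The feature extraction and evaluation scheme can be sketched independently of the classifier. The example below is an illustration in Python rather than the Stanford Classifier configuration used in the study: it assumes whitespace-tokenized input, and the round-robin assignment of items to folds is one of several possible ways to build the 10 folds.

```python
from typing import Iterator, List, Tuple


def ngram_features(tokens: List[str]) -> List[str]:
    """Unigram and bigram features for one tokenized tweet."""
    unigrams = list(tokens)
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return unigrams + bigrams


def k_fold_indices(n_items: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, held_out) index lists; each item is held out exactly once."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_items) if i not in held]
        yield train, held_out


features = ngram_features(["a", "b", "c"])
splits = list(k_fold_indices(25, k=10))
```

In cross validation, a classifier is trained on each `train` list and scored on the matching `held_out` list, and the reported accuracy is the average over the folds.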

In the process of textual analysis, we also faced some challenges. Sarcasm detection has previously proved to be very difficult (Rajadesingan et al., 2015), and we faced the same challenge in the present study. Our results show high precision and low recall in sarcasm detection: the tweets the classifier flagged as sarcastic were usually sarcastic, but most sarcastic tweets went undetected. Therefore, we could not use the results of supervised learning reliably. Our predictive model for identifying political orientation from tweets also did not yield acceptable results. Table 3 presents the results of our textual analysis.

Table 3.

Summary of the performance of textual analysis tasks.

Task	Accuracy	Precision	Recall
Language identification	0.98	Multiclass^a	Multiclass
Subject	0.91	0.90	0.99
Sentiment	0.84	0.60	0.86
Sarcasm	0.48	0.86	0.17
Political orientation	0.33	Multiclass	Multiclass

^a Precision and recall cannot be reported for multiclass classification.
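For the binary tasks in Table 3, the three measures follow the standard definitions based on true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The sketch below uses made-up labels chosen to show a high-precision, low-recall pattern; it does not reproduce the figures in the table.

```python
from typing import Dict, List


def binary_metrics(gold: List[int], predicted: List[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for labels coded 1 (positive) and 0 (negative)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)
    tn = sum(1 for g, p in pairs if g == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Reported as 0.0 when the denominator is empty (a convention chosen here).
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels: six positive tweets, of which the classifier flags only one.
gold_labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = binary_metrics(gold_labels, predictions)
```

Here every flagged tweet is correct (precision 1.0), yet five of the six positive tweets are missed (recall 1/6), which is why such a classifier cannot be relied on to find the positive class.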

In the next section, we discuss the major findings of the two tracks of analysis.

Findings

Descriptive analysis

Overall, the data collected for this project included tweets in multiple languages, such as English, Spanish, Arabic, and Persian (Table 4). Persian was the fourth most tweeted language and accounted for only about 15% of the total data set. The large number of Arabic tweets is associated with the similarity of Arabic and Persian scripts; however, it also shows the presence of a strong conversation about Iran among Arabic-speaking Twitter users. The considerable number of Spanish tweets may be associated with Iran's increased engagement with Latin America in the years prior to the 2013 election and its close ties with countries such as Venezuela.

Table 4.

Distribution of languages in the data set.

Language	Percentage of total data set
English	32.23%
Arabic	27.33%
Spanish	17.55%
Persian	15.30%
Other	7.59%

Beyond languages, it is important to see who tweeted most. The distribution of user activity on Twitter is a long-tail distribution that follows a power law: a small portion of users accounted for the majority of tweets. Only 7% of users posted more than five tweets with election-related content during the six weeks of data collection, while 71% of users tweeted only once during this period. This shows that a small portion of users at the time actively contributed to content creation and dissemination, while the majority of users were mostly consumers of content. This might be associated with security concerns related to participating in sensitive discussions online in Iran.
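The two activity shares reported above can be computed from a list of tweet authors, as in the sketch below. The function name `activity_shares` and the toy author list are hypothetical; the more-than-five threshold follows the text.

```python
from collections import Counter


def activity_shares(tweet_authors, threshold=5):
    """Return the share of users who tweeted once and the share who
    tweeted more than `threshold` times."""
    counts = Counter(tweet_authors)
    n_users = len(counts)
    once = sum(1 for c in counts.values() if c == 1) / n_users
    heavy = sum(1 for c in counts.values() if c > threshold) / n_users
    return once, heavy


# Toy data: one heavy user, one occasional user, three one-time users
authors = ["u1"] * 8 + ["u2"] * 2 + ["u3", "u4", "u5"]
once_share, heavy_share = activity_shares(authors)
```

In this toy sample, three of five users tweeted once and one of five tweeted more than five times, which mirrors the long-tail shape described above on a very small scale.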

Figure 3 illustrates the temporal distribution of tweets per day in three groups of tweets (all tweets, Persian and English tweets, and only Persian tweets). The peaks marked on the graph show the major events during this six-week period, which began with the disqualification of Rafsanjani, a prominent politician and former president, followed by three televised presidential debates, and finally the election. The distance between the lines on the graph shows that only Persian users reacted to the debates, whereas all users reacted to the election results. While analyzing the temporal distribution of tweets, we identified an anomaly in the hourly frequency of Persian and English tweets (Figure 4). While the number of tweets in English spiked on election day, the number of tweets in Persian shows a surprising plateau. This anomaly may be associated with proactive censorship practices (see ASL 19 report).⁵ After the 2009 election, the Iranian government used throttling as a mechanism to reduce the efficiency of internet communications. The results of ASL 19 monitoring of internet traffic in Iran show an increase in the number of attempts per user required to use the circumvention tool Psiphon on election day. This is a significant finding, because it shows that investigating levels of social media use can also reveal patterns of network disruption, and researchers should be aware of such incidents when interpreting their results.

Figure 3.

Temporal distribution of tweets and their connection to events.

Figure 4.

Temporal distribution of tweets per hour in the Persian and English data sets, showing a surprising plateau in Persian tweets on election day.

Network analysis

Most visible. First, we analyzed the retweet network of Persian tweets. Our results show that Iranian micro-celebrities had the highest presence among the top hundred most retweeted users in Persian (not listed here due to privacy). The second most retweeted group was journalists. However, there were only eight official news/media outlets among the hundred most retweeted accounts. These media sources were: BBC news in Persian (@bbcpersian); Kaleme, a non-official news source of the Green Movement in Iran (@kaleme); Deutsche Welle radio in Persian (@dw_persian); Manoto Persian TV from London (@ManotoNews); the official Persian channel of the US Secretary of State (@USAdarFarsi); Mardomak, a non-official news source of opposition in Iran (@mardomak); and Radio Farda, Radio Free Europe in Persian (@RadioFarda_). It is important to note that none of the media sources from inside Iran were among the most retweeted accounts. This can be attributed to the Twitter ban in Iran and the possible legal consequences of using the social network in Iran at the time of the election in 2013. In English tweets, in contrast to the Persian tweets, the most retweeted users were from two institutional elite categories, journalists and news/media outlets; micro-celebrities were not as prevalent as in the Persian tweet network. Among the top hundred most retweeted accounts in the English network, there were 26 news/media accounts, only two of which (@bbcpersian and @MehrnewsCom) tweet in Persian.

Most influential. In the Persian retweet network, the only news/media outlet among the top hundred influential users was BBC news in Persian (@bbcpersian), which ranked fourth on this list. Most of the users among the top hundred influential users fell into the Iranian micro-celebrity category. In contrast, for the English retweet network, the most influential categories were journalists and news/media. Two of the presidential candidates, Hassan Rouhani (@HassanRouhani) and MohammadReza Aref (@MohamadRezaAref), had high rankings, in the fourth and fifth places.

Our findings demonstrate a structural difference between the Persian and English Twitterspheres. Discussion about Iran in Persian on Twitter is mostly dominated by micro-celebrities, whereas English Twitter is dominated by institutional elites (i.e., official media, news agencies, journalists, and politicians). Mascaro (2015) found that the political conversation on English Twitter during the 2012 US presidential election was dominated by institutional elites as well. This difference in power structure shows that Persian Twitter in 2013 was dominated by micro-celebrities, which may be attributable to the ban on Twitter in Iran and the resulting absence of journalists and news/media outlets. However, this situation has changed since 2013, because Iran's president, Hassan Rouhani, and minister of foreign affairs, Javad Zarif, actively used Twitter shortly after the 2013 election, which encouraged journalists and news/media outlets in Iran to also use Twitter.

Community analysis. Community detection on the larger data set, the English retweet network, mostly grouped users by nationality. However, the result of community detection on the Persian retweet network was more informative. The force-based visualization of communities, as marked in Figure 5, illustrates the presence of an isolated group in the network, the members of which are highly connected to each other but not connected to the rest of the network. Next, we studied the profiles of the users in this community. We identified this community as the MEK.⁶ It is important to note that the users in this group have very high levels of activity (high indegree centrality), but they do not have high eigenvector scores. This means that while they are very active and visible through retweeting each other, they do not have much influence on the overall conversation and the diffusion of information in the network. This finding underlines that the structural position of users should be considered when interpreting the importance of users' messages. Bruns and Stieglitz (2013) discussed the two categories of the most active and most visible users in an analysis of Twitter networks, arguing that the most active users are not necessarily the most visible users. We want to extend this idea by arguing that even the most visible users are not necessarily the most influential ones. This means that how many times a message is passed in the network via retweets is not as important as who is retweeting that message and what the structural position of that user is in the network.
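The contrast between indegree and eigenvector centrality can be illustrated with a toy retweet network. The sketch below is an assumption-laden illustration, not the study's computation: the edge list is invented, edges point from retweeter to retweeted author, and eigenvector centrality is approximated with a plain power iteration normalized by the maximum score.

```python
def in_degree(edges):
    """Count how many retweets each author received."""
    counts = {}
    for retweeter, author in edges:
        counts.setdefault(retweeter, 0)
        counts[author] = counts.get(author, 0) + 1
    return counts


def eigenvector_centrality(edges, iterations=100):
    """Approximate eigenvector centrality by power iteration."""
    nodes = sorted({n for edge in edges for n in edge})
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {n: 0.0 for n in nodes}
        for retweeter, author in edges:
            # An author inherits the score of everyone who retweets them
            new[author] += score[retweeter]
        top = max(new.values())
        score = {n: v / top for n, v in new.items()}
    return score


# Hypothetical main network: a dense core (a-d) plus two peripheral users
core = ["a", "b", "c", "d"]
edges = [(u, v) for u in core for v in core if u != v]
edges += [("a", "e"), ("e", "a"), ("b", "f"), ("f", "b")]
# Hypothetical isolated group: x, y, z only retweet each other
isolated = ["x", "y", "z"]
edges += [(u, v) for u in isolated for v in isolated if u != v]

indeg = in_degree(edges)
score = eigenvector_centrality(edges)
```

In this toy network, each isolated user is retweeted more often than the peripheral user `e`, yet their eigenvector scores shrink toward zero, because the score flows only within the weaker, disconnected group. Note that the outcome depends on the isolated group being less densely connected than the main network; a sufficiently dense isolated group could dominate instead.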

This finding also shows the affordances of force-based visualization algorithms in identifying notable isolated communities. By looking at such visualizations of retweet networks, we may identify communities that are not easily detected through other methods.

Textual analysis

As discussed in the methods section, we achieved success in detecting sentiment toward candidate names. Thus, we used the distribution of sentiments toward candidates to measure the distribution of political support for different political groups in Iran. Table 5 shows the share of detected sentiment expressions for each candidate. The largest shares of expressions were directed toward Rouhani (39.20%) and Jalili (14.47%), who were the preferred candidates of reformist and conservative groups, respectively. Figure 6 shows the distribution of positive and negative sentiments toward different candidates and the ratio of positive to negative sentiments for each candidate. As can be seen from the positive to negative ratios, reformist candidates were far more popular, indicating that sentiment on Persian Twitter predominantly favored reformist candidates. Our findings show that the landscape of the Persian Twittersphere differed from that of the Persian blogosphere, which was evenly divided between the reformist and conservative camps in 2008 according to Kelly and Etling (2008).

Figure 6.

Distribution of positive and negative sentiments toward different candidates.

Table 5.

Results of sentiment analysis for tweets that mention candidates' names.

Candidate	Total number of tweets with candidate name mentioned	Positive	Negative	Percent of total expressed sentiments	Positive to negative ratio
Rouhani	101,502	67.66%	13.48%	39.20%	5.02
Aref	24,527	68.50%	14.03%	9.47%	4.88
Qalibaf	22,109	34.36%	48.83%	8.54%	0.70
Velayati	10,754	32.54%	51.57%	4.15%	0.63
Rezaei	18,071	27.06%	59.93%	6.98%	0.45
Jalili	37,478	23.87%	62.45%	14.47%	0.38
Gharazi	29,464	14.61%	74.10%	11.38%	0.20
Haddad Adel	15,028	14.30%	75.35%	5.80%	0.19
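The two derived columns of Table 5 can be recomputed from the other columns, as in the sketch below. The mention counts and percentages are copied from the table; reading the "Percent of total expressed sentiments" column as each candidate's share of all candidate mentions is an inference that happens to reproduce the published values, not a definition stated in the text. The helper name `derived_columns` is hypothetical.

```python
# (tweets mentioning the candidate, % positive, % negative), from Table 5
table5 = {
    "Rouhani": (101502, 67.66, 13.48),
    "Aref": (24527, 68.50, 14.03),
    "Qalibaf": (22109, 34.36, 48.83),
    "Velayati": (10754, 32.54, 51.57),
    "Rezaei": (18071, 27.06, 59.93),
    "Jalili": (37478, 23.87, 62.45),
    "Gharazi": (29464, 14.61, 74.10),
    "Haddad Adel": (15028, 14.30, 75.35),
}

total_mentions = sum(row[0] for row in table5.values())


def derived_columns(name):
    """Return (share of all mentions in %, positive-to-negative ratio)."""
    mentions, positive, negative = table5[name]
    share = round(100 * mentions / total_mentions, 2)
    ratio = round(positive / negative, 2)
    return share, ratio


rouhani = derived_columns("Rouhani")
jalili = derived_columns("Jalili")
```

A ratio above 1 means positive expressions outnumber negative ones; in the table only the two reformist-backed candidates, Rouhani and Aref, exceed 1.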

One common issue in digital Persian text processing is that an Arabic character is often substituted for a visually similar Persian character. The Arabic character was incorrectly implemented in lieu of the Persian one by Microsoft Windows and Apple iOS; therefore, two characters with different Unicode values can be used to write the same letter. On the surface they look similar, but they have different character values. We included both variants to make sure that we collected all related tweets.
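A minimal sketch of how such variants can be handled is shown below. Which characters the passage refers to is not recoverable here, so the two pairs used (Arabic Yeh U+064A versus Farsi Yeh U+06CC, and Arabic Kaf U+0643 versus Keheh U+06A9) are assumptions chosen as commonly cited examples; the function names are hypothetical as well.

```python
# Assumed look-alike pairs: Arabic code point -> Persian code point
ARABIC_TO_PERSIAN = {
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}
PERSIAN_TO_ARABIC = {v: k for k, v in ARABIC_TO_PERSIAN.items()}


def normalize_persian(text):
    """Map the assumed Arabic look-alikes to their Persian code points."""
    return text.translate(str.maketrans(ARABIC_TO_PERSIAN))


def query_variants(term):
    """Return both spellings so a keyword search matches either code point."""
    persian = normalize_persian(term)
    arabic = persian.translate(str.maketrans(PERSIAN_TO_ARABIC))
    return {persian, arabic}


# A candidate name typed with the Arabic Yeh as its final letter
typed = "\u0631\u0648\u062d\u0627\u0646\u064a"
normalized = normalize_persian(typed)
variants = query_variants(typed)
```

Searching with every variant matches tweets regardless of which keyboard layout produced them, while normalizing after collection keeps later token counts from being split across two spellings of the same word.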

One of the methodological contributions of this study is the use of sentiment analysis for mapping the political landscape of Twitter by training a classifier using N-gram features. This method proved to be more effective than detecting the political orientation of tweets or using lexicon-based sentiment analysis methods, which might be problematic (Grimmer and Stewart, 2013).

Discussion

In the present study, we examined the political landscape of Twitter during Iran's presidential election in 2013. We also investigated the information diffusion processes on Twitter during this period, the most influential users, and major communities. First, we found that Persian Twitter had a different power structure from English Twitter during the period of study (our data set included both Persian and English tweets). Persian Twitter was largely dominated by micro-celebrities, that is, social media users who created a strong base of followers and employed a performance style to increase their popularity among readers, viewers, and those to whom they were linked online. Discussion about Iran on Twitter in English, on the other hand, was dominated by institutional elites such as politicians, journalists, and news/media outlets. This difference in the power structure may be the result of restrictions on Twitter use in Iran, which created an environment mostly dominated by micro-celebrities. However, this structural difference may have altered since the election of Hassan Rouhani and the use of Twitter by some government officials, which has encouraged other institutional elites in Iran to use Twitter since 2013.

Second, our results show that the most visible users and messages are not necessarily the most influential ones. In other words, how many times a message is retweeted is not as important as who retweets the message and the structural position of that user in the network.

Our third finding is that different Persian social media platforms might have different political landscapes. The political landscape of Persian Twitter in 2013 was dominated by reformists; by contrast, Kelly and Etling (2008) found that the Persian blogosphere was evenly divided between reformists and conservatives. This encourages us to study the ecology of platforms, since political groups might use different platforms to spread their messages based on the affordances of those platforms.

The methodological contributions of our results are twofold. First, our results regarding the application of NLP show that sentiment analysis toward name entities can be more accurate in mapping the political landscape of conversation on Twitter than detecting political orientation from the content of tweets. Second, we noticed that force-based visualization algorithms can help us identify isolated groups and that analyzing those user profiles can reveal very interesting patterns of communication.

In the future, we would like to evaluate whether the power structure of Persian Twitter has changed since 2013. Such a study could show whether and how the recent use of Twitter by Iranian politicians has affected the power structure of its communication networks. We also believe that conducting qualitative interviews with influential users and collecting data through online surveys from Twitter users can enrich our perspective on the dynamics of the Persian Twittersphere. To improve our computational tools, we are interested in expanding our training data to improve the effectiveness and accuracy of our NLP toolkit.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Notes

References

Anstead and Chadwick (2018) A primary definer online: The construction and propagation of a think tank's authority on social media. Media, Culture & Society 40(2): 246–266.

Arnaboldi, Conti, La Gala, et al. (2014) Information diffusion in OSNs: The impact of Nodes' sociality. In: Proceedings of the 29th annual ACM symposium on applied computing (eds Y Cho, SY Shin, S Kim, et al.), Gyeongju, Republic of Korea, 24–28 March 2014, pp.616–621. New York, NY: Association for Computing Machinery.

Bäck, Sendén, et al. (2018) From I to we: Group formation and linguistic adaption in an online xenophobic forum. Journal of Social and Political Psychology 6(1): 76–91.

Bergsma, McNamee, Bagdouri, et al. (2012) Language identification for creating language-specific Twitter collections. In: Proceedings of the second workshop on language in social media, Montreal, Canada, 7 June 2012, pp.65–74. Stroudsburg, PA: Association for Computational Linguistics.

Black, Mascaro, Gallagher, et al. (2012) Twitter zombie: Architecture for capturing, socially transforming and analyzing the Twittersphere. In: Proceedings of the 17th ACM international conference on supporting group work, Sanibel Island, FL, 27–31 October 2012, pp.229–238. New York, NY: Association for Computing Machinery.

Blondel, Guillaume, Lambiotte, et al. (2008) Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment 2008(10): P10008. https://doi.org/10.1088/1742-5468/2008/10/P10008

boyd (2011) Social network sites as networked publics: Affordances, dynamics, and implications. In: Papacharissi (ed) A networked self, New York, NY: Routledge, pp. 39–58.

boyd d, Golder S and Lotan G (2010) Tweet, tweet, retweet: Conversational aspects of retweeting on Twitter. In: 2010 43rd Hawaii international conference on system sciences (HICSS), Hawaii, 5–8 January 2010, pp.1–10. Washington, DC: IEEE Computer Society.

Brown (2012) Finding and identifying text in 900+ languages. Digital Investigation 9: S34–S43.

Bruns A and Burgess JE (2011) The use of Twitter hashtags in the formation of ad hoc publics. In: Proceedings of the 6th European consortium for political research (ECPR) general conference 2011, University of Iceland, Reykjavik, 27 August 2011. Available at: http://www.ecprnet.eu/conferences/general_conference/reykjavik/ (accessed 9 May 2018).

Bruns and Stieglitz (2013) Towards more systematic Twitter analysis: Metrics for tweeting activities. International Journal of Social Research Methodology 16(2): 91–108.

Ceron (2017) Intra-party politics in 140 characters. Party Politics 23(1): 7–17.

Christensen (2011) Discourses of technology and liberation: State aid to net activists in an era of “Twitter Revolutions.” The Communication Review 14(3): 233–253.

Dubois and Gaffney (2014) The multiple facets of influence: Identifying political influentials and opinion leaders on Twitter. American Behavioral Scientist 58(10): 1260–1277.

Emmons, Kobourov, Gallant, et al. (2016) Analysis of network clustering algorithms and cluster quality metrics at scale. PLoS One 11(7). http://doi.org/10.1371/journal.pone.0159161

Engesser, Ernst, Esser, et al. (2017) Populism and social media: How politicians spread a fragmented ideology. Information, Communication & Society 20(8): 1109–1126.

Enli (2017) Twitter as arena for the authentic outsider: Exploring the social media campaigns of Trump and Clinton in the 2016 US presidential election. European Journal of Communication 32(1): 50–61.

Faris and Rahimi (2015) Social Media in Iran: Politics and Society After 2009, Albany, NY: State University of New York Press.

González-Bailón, Borge-Holthoefer, Rivero, et al. (2011) The dynamics of protest recruitment through an online network. Scientific Reports 1: 197.

Grimmer and Stewart (2013) Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis 21(3): 267–297.

Heo, Park, Kim, et al. (2016) The emerging viewertariat in South Korea: The Seoul mayoral TV debate on Twitter, Facebook, and blogs. Telematics and Informatics 33: 570–583.

Howard (2010) The Digital Origins of Dictatorship and Democracy: Information Technology and Political Islam, Oxford: Oxford University Press.

Howard and Hussain (2012) Democracy's Fourth Wave? Digital Media and the Arab Spring, Oxford: Oxford University Press.

Jacomy, Venturini, Heymann, et al. (2014) ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi Software. PLoS One 9(6). https://doi.org/10.1371/journal.pone.0098679

Jungherr (2015) Analyzing Political Communication with Digital Trace Data: The Role of Twitter Messages in Social Science Research, Cham: Springer.

Jungherr (2016) Twitter use in election campaigns: A systematic literature review. Journal of Information Technology & Politics 13(1): 72–91.

Jungherr, Schoen and Jürgens (2015) The mediation of politics through Twitter: An analysis of messages posted during the campaign for the German federal election 2013. Journal of Computer-Mediated Communication 21(1): 50–68.

Kelly J and Etling B (2008) Mapping Iran's Online Public: Politics and Culture in the Persian Blogosphere. Report, Berkman Center, April.

Khazraee and Losey (2016) Evolving repertoires: Digital media use in contentious politics. Communication and the Public 1(1): 39–55.

Khonsari KK, Nayeri ZA, Fathalian A, et al. (2010) Social network analysis of Iran's Green Movement Opposition groups using Twitter. In: 2010 international conference on advances in social networks analysis and mining, Odense, Denmark, 9–11 August 2010, pp.414–415. IEEE.

Kim, Lee and Park (2016) Delineating the complex use of a political podcast in South Korea by hybrid web indicators: The case of the Nakkomsu Twitter network. Technological Forecasting and Social Change 110: 42–50.

Lefebvre (1991) The Production of Space, Cambridge, MA: Blackwell.

Lotan, Graeff, Ananny, et al. (2011) The revolutions were tweeted: Information flows during the 2011 Tunisian and Egyptian revolutions. International Journal of Communication 5(0): 1375–1405.

McGregor (2018) Personalization, social media, and voting: Effects of candidate self-personalization on vote intention. New Media & Society 20(3): 1139–1160.

Manning C and Klein D (2003) Optimization, maxent models, and conditional estimation without magic. In: Proceedings of the 2003 conference of the North American chapter of the association for computational linguistics on human language technology: tutorials – volume 5 (eds M Hearst and M Ostendorf), Edmonton, Canada, 27 May–1 June 2003, pp.8–8. Stroudsburg, PA: Association for Computational Linguistics.

Martínez-Cámara, Martín-Valdivia, Ureña-López, et al. (2014) Sentiment analysis in Twitter. Natural Language Engineering 20(1): 1–28.

Mascaro CM (2015) Technologically mediated discourse and information exchange through medium specific syntactical features: The 2012 presidential election on Twitter. PhD Thesis, Drexel University, PA.

Morales, Losada and Benito (2012) Users structure and behavior on an online social network during a political protest. Physica A: Statistical Mechanics and Its Applications 391(21): 5244–5253.

Morozov (2009) Iran: Downside to the ‘Twitter Revolution.’ Dissent Magazine 10–14. Fall 2009.

Morozov (2011) The Net Delusion: The Dark Side of Internet Freedom, 1st ed. New York, NY: Public Affairs.

Mottahedeh (2015) #iranelection: Hashtag Solidarity and the Transformation of Online Life, Stanford, CA: Stanford Briefs, an imprint of Stanford University Press.

Ott (2017) The age of Twitter: Donald J. Trump and the politics of debasement. Critical Studies in Media Communication 34(1): 59–68.

Papacharissi (2011) A Networked Self: Identity, Community and Culture on Social Network Sites, New York, NY: Routledge.

Papacharissi and Oliveira (2012) Affective news and networked publics: The rhythms of news storytelling on #Egypt. Journal of Communication 62(2): 266–282.

Park, Lim, et al. (2016) Expanding the presidential debate by tweeting: The 2012 presidential election debate in South Korea. Telematics and Informatics 33: 557–569.

Paßmann, Boeschoten and Schäfer (2014) The gift of the gab: Retweet cartels and gift economies on Twitter. In: Weller, Bruns, Burgess, et al. (eds) Twitter and Society, New York, NY: Peter Lang, pp. 331–344.

Penney and Dadas (2014) (Re)Tweeting in the service of protest: Digital composition and circulation in the Occupy Wall Street movement. New Media & Society 16(1): 74–90.

Pezzoni, Passarella, et al. (2013) Why do I retweet it? An information propagation model for microblogs. In: Social Informatics, Cham: Springer, pp. 360–369.

Rajadesingan A, Zafarani R and Liu H (2015) Sarcasm detection on Twitter: A behavioral modeling approach. In: Proceedings of the eighth ACM international conference on web search and data mining (eds X Cheng, H Li, E Gabrilovich, et al.), Shanghai, China, 2–6 February 2016, pp.97–106. New York, NY: Association for Computing Machinery.

Rauchfleisch and Schäfer (2015) Multiple public spheres of Weibo: A typology of forms and potentials of online public spheres in China. Information, Communication & Society 18(2): 139–155.

Reuters (2017) Twitter reports 6 pct increase in monthly active users, 26 April. Available at: https://www.reuters.com/article/twitter-results/twitter-reports-6-pct-increase-in-monthly-active-users-idUSL4N1HY48L (accessed 1 September 2017).

Senft (2008) Camgirls: Celebrity and Community in the Age of Social Networks, New York, NY: Lang.

Stephansen and Couldry (2014) Understanding micro-processes of community building and mutual learning on Twitter: A ‘small data’ approach. Information, Communication & Society 17(10): 1212–1227.

Tiryakian, Gruzd, Wellman, et al. (2011) Imagining Twitter as an imagined community. American Behavioral Scientist 55(10): 1294–1318.

Tripathi and Naganna (2014) Opinion mining: A review. International Journal of Information & Computation Technology 4(16): 1625–1635.

Wang and Chu (2017) Networked publics and the organizing of collective action on Twitter: Examining the #Freebassel campaign. Convergence: The International Journal of Research into New Media Technologies 1–16.

Web Ecology Project (2009) The Iranian election on Twitter: Web Ecology Project. Available at: http://www.webecologyproject.org/2009/06/iran-election-on-twitter/ (accessed 31 December 2014).

Weller, Bruns, Burgess, et al. (2014) Twitter and Society, New York, NY: Peter Lang.

Wojcieszak and Smith (2014) Will politics be tweeted? New media use by Iranian youth in 2011. New Media & Society 16(1): 91–109.

Wojcieszak M, Smith B and Enayat M (2013) Finding a way: How Iranians reach for news and information. Report, Iran Media Program, Philadelphia, 2011–2012.

Yang, Algesheimer and Tessone (2016) A comparative analysis of community detection algorithms on artificial networks. Scientific Reports 6: 30750.

Zafarani, Abbasi and Liu (2014) Social Media Mining: An Introduction, New York, NY: Cambridge University Press.

Zhou Z, Bandari R, Kong J, et al. (2010) Information resonance on Twitter: Watching Iran. In: Proceedings of the first workshop on social media analytics (eds P Melville, J Leskovec and F Provost), Washington, DC, 25–28 July 2010, pp.123–131. New York, NY: Association for Computing Machinery.

Mapping the political landscape of Persian Twitter: The case of 2013 presidential election
