Abstract
The fallacy of premature designations such as “Iran's Twitter Revolution” can be attributed to the empirical gap in our knowledge about such sociotechnical phenomena in non-Western societies. To fill this gap, we need in-depth analyses of social media use in those contexts and to create detailed maps of online public environments in such societies. This paper aims to present such a cartography of the political landscape of Persian Twitter by studying the case of Iran's 2013 presidential election. The objective of this study is twofold: first, to fill the empirical gap in our knowledge about Twitter use in Iran, and second, to develop computational methods for studying Persian Twitter (e.g., effective methods for analyzing Persian text) and identify the best methods for addressing different issues (e.g., topic detection and sentiment analysis). During Iran's 2013 presidential election, three million tweets were collected and analyzed using social network analysis and machine learning. The findings provide a more nuanced view of the political landscape of Persian Twitter and identify patterns in accordance with or in contrast to those identified in the English-speaking Twittersphere around the 2013 presidential election. Persian Twitter was dominated by micro-celebrities, whereas institutional elites dominated English discourse about Iran on Twitter. The results also illustrate that Persian Twitter in 2013 was predominantly in favor of reformists. Finally, this study demonstrates that sentiment analysis toward political named entities can be used efficiently for mapping the political landscape of conversation on Twitter.
This article is part of the special theme on Social Media & Society. A full list of articles in this special theme is available at https://journals.sagepub.com/page/bds/collections/social-media-society.
Introduction
On 15 June 2009, millions of Iranian citizens marched through the streets of Tehran to protest the outcome of the disputed presidential election in which Mir-Hossein Mousavi, a reformist candidate, was running against the incumbent conservative candidate Mahmoud Ahmadinejad. The next day, the Washington Times wrote an editorial titled “Iran's Twitter Revolution.” 1 The fallacy of this description has been thoroughly discussed in subsequent scholarship (Howard, 2010; Morozov, 2011). However, during the following uprisings in Iran's Green Movement, and later during the Arab Spring, many political elites and scholars overemphasized the role of social media. While many young Iranians used social media—mainly Facebook—in 2009 (Howard and Hussain, 2012; Khazraee and Losey, 2016), Twitter did not play any role in mobilizing people in Iran (Christensen, 2011; Morozov, 2009). One reason for misunderstandings such as this is an empirical gap in our knowledge about such sociotechnical phenomena in non-Western societies. To fill this gap, we need in-depth analyses of social media use in those contexts.
Social media scholarship is an active area of research, but most of this research has focused on Western societies. There are fewer studies that address the international uses of social media. While there is a growing body of literature about social media use in Iran (Faris and Rahimi, 2015; Khazraee and Losey, 2016; Wojcieszak and Smith, 2014; Wojcieszak et al., 2012), there are still major gaps in our understanding of the political landscape of Persian Twitter. As with any unknown area, to better understand what exists in a domain and how things are related to each other, we need to map its landscape. In the absence of empirical maps, there is always a danger of projecting our impressions, as happened in the case of the so-called Twitter Revolution. However, to create empirical maps, we need methodically and rigorously collected evidence. Data collection through public opinion surveys and other traditional social science research methods is highly restricted in countries such as Iran. Fortunately, recent technological developments and the widespread use of social media enable us to use digital trace data (Jungherr, 2015) and Big Data analytics to create detailed maps of social interactions in online public environments, which can better represent those communities.
This paper aims to present such a cartography of the political landscape of Persian Twitter by studying the case of Iran's 2013 presidential election. Its objective is twofold. First, as mentioned above, it aims to fill the empirical gap in our knowledge about Twitter use in Iran. Second, it aims to deploy computational methods for studying Persian Twitter and identify effective ones for addressing different issues (e.g., topic detection and sentiment analysis). The purpose of this study is to illustrate an approach to fill both the empirical and the methodological gaps in studying social media use in Iran using big social data and computational methods. To that end, this paper provides a more nuanced view of the political landscape of Persian Twitter discourse and identifies patterns in accordance with or in contrast to those identified in the English-speaking Twittersphere around the 2013 presidential election. It also discusses the mechanisms of information diffusion in that landscape.
Literature review
Social media as networked publics in Iran
boyd (2011) defines networked publics as “publics that are restructured by networked technologies” (p.39). She argues that as we move our social lives more and more to networked publics, it is critical to understand the affordances, dynamics, and implications of networked publics for social life. Papacharissi (2011) discusses the mutual transformation of individuals, identities, and citizenship through an investigation of the relationship between digital media and democracy. She argues that digital citizenship is civic responsibility enabled by digital technologies. Networked publics serve many of the functions of traditional publics, including civic engagement and connecting people to others beyond their familial or physical circles; however, they also introduce new affordances such as replicability, scalability, and searchability (boyd, 2011). Public space can be defined as a physical or social phenomenon (Lefebvre, 1991). Following the notion of the social construction of space, under repressive cultures, social media create opportunities to form social spaces for deliberation, functioning as short-term public spheres that enable discussions of sensitive topics and the presentation of dissenting views (Rauchfleisch and Schäfer, 2015). Social media have also emerged as part of repertoires for contentious politics (Khazraee and Losey, 2016). In Iran, social media created such public spaces in different forms, mostly to fill the absence of free public spaces (Faris and Rahimi, 2015; Khazraee and Losey, 2016).
Iranians were among the early adopters of social media. In the 2000s, the Iranian blogosphere developed as a national and international platform for vibrant discussions around diverse topics, including politics. “Blogestan,” as it is often called, emerged as a leading platform for user-generated content. Blogestan also offered a central platform for contentious issues such as politics (Faris and Rahimi, 2015; Kelly and Etling, 2008; Khazraee and Losey, 2016). Kelly and Etling (2008) conducted one of the earliest studies investigating the landscape of social media use in Iran by mapping Blogestan. Employing social network analysis (SNA) and automated content analysis, they created a high-resolution image of thousands of Persian blogs. The outcome of that study revealed the gap in prevalent beliefs about Blogestan and demonstrated how empirical research using computational methods can change our understanding of online publics. Prior to Kelly and Etling's work, it was widely assumed that Blogestan was dominated by pro-reformist bloggers. However, their study illustrated that Blogestan was relatively evenly divided between reformist and conservative bloggers. While there is a growing body of work on social media in Iran, there is still a pressing need for further studies along the lines of the work done by Kelly and Etling, using big social data to investigate the landscape of online publics in Iran, including different social media outlets.
Twitter as a medium for political communication
With its 328 million monthly active users communicating in over 40 different languages (Reuters, 2017), Twitter has become a popular medium not only for personal information sharing and interpersonal communication but also for more structured collective activities during social and political events. Twitter allows users and their communications to be visible and identifiable by others; it allows users to connect by using common hashtags, mentioning one another, and replying to one another; and communication flow can be internally coordinated by the members of a self-established communication network (Wang and Chu, 2017).
Twitter's role in political communication in different parts of the world, especially around elections and campaigning, has increasingly been studied in recent years (Weller et al., 2014). Such studies have generally examined the use of Twitter as a communication channel for social protests (Penney and Dadas, 2014), political election campaigns (Enli, 2017; Heo et al., 2016; Jungherr, 2016; Jungherr et al., 2015; Kim et al., 2016; McGregor, 2018; Ott, 2017; Park et al., 2016), and revolutionary movements during political upheavals (Lotan et al., 2011). In these contexts, Twitter has been used either by politicians (parties and candidates) as a platform for self-representation, or by citizens or “Twitter publics” (Jungherr, 2016), as a communication tool for arranging social and political movements (Anstead and Chadwick, 2018; Ceron, 2017; Engesser et al., 2017).
Some scholars have focused on the use of Twitter around party politics. Dubois and Gaffney (2014) identified the most influential political players within the two largest Canadian political communities on Twitter through a comparative study of different SNA measures of indegree, eigenvector centrality, clustering coefficient, knowledge, and interaction. They revealed that political elites such as media outlets, journalists, and politicians were most influential; political commentators and bloggers were somewhat less influential; and opinion leaders were still less influential, mainly influencing those in their immediate personal networks. Another study focused on two cases: the Dutch parliamentary Twittersphere, which reveals “functional interactions between professional elites,” and the German Twittersphere, which reveals “two loosely connected networks with quite different core interests: net politics and fun” (Paßmann et al., 2014: 333). The authors argue that users' behaviors are dependent on their particular socio-cultural contexts and discuss how such behaviors contribute to the way retweeting, favoriting, and circulation of Twitter messages occur in general.
Other scholars have focused on how ordinary users engage with contentious politics on Twitter. These studies paid attention to news creation and communication flows, and key players controlling it. Papacharissi and Oliveira (2012) studied #egypt communications on Twitter during the 2011 Egyptian uprisings leading to the resignation of Egyptian President Mubarak. In this large-scale study, the authors explored four prominent news values specific to Twitter that made this microblogging site a platform for turning events into news stories in a context where access to mainstream news was restricted or controlled by the government. The news values were instantaneity (retweets and requests for instant updates), crowdsourced elites (participation of key bloggers, activists, and informed citizens), solidarity (communication of words of solidarity), and ambience (creation of an ambient information-sharing environment). Morales et al. (2012) examined the 2010 #SOSInternetVE Twitter protest, which was organized to support freedom of information on the internet in Venezuela, through complex network analysis. They identified retweets as the main mechanism for information propagation in such environments. Lotan et al. (2011) focused on the Twitter communication flow during the 2011 Tunisian and Egyptian uprisings and identified the key actors of the network to be mainstream media organizations, individual journalists, influential regional and global actors, and other ordinary users who actively tweeted posts related to these two revolutions. González-Bailón et al. (2011) studied Twitter political communications related to the 15-M movement in Spain in 2011 and identified two groups of key users: early participants, who were the activists leading the protests, and those who were influenced by this group and joined at a later time but eventually became the more central actors in the network in disseminating information. 
They concluded that users who acted as seeds of message cascades (i.e., the spreaders of information) held more central positions in the network.
Heo et al. (2016) investigated the role of social media including Twitter in relation to traditional media during Seoul's 2011 mayoral election. They concluded that social media users actively participate in the production and restructuring of political discourse beyond what is presented by traditional media. The same authors also studied Twitter issue networks formed around the 2012 presidential debates in South Korea (Park et al., 2016). Their findings show that political discussion around elections forms homogeneous communities centered on political ideologies and geographic locations. They also demonstrate that the landscape of Korean Twitter during televised debates favored the opposition candidate. In a similar study, Kim et al. (2016) discuss the role of Twitter in a hybrid media system in the context of elections. They discuss how Twitter was used for direct interaction among users of social media and the listeners of a popular political podcast. Such studies highlight the need for further research focusing on non-Western contexts.
Participation in online publics through Twitter can also contribute to the formation of communities of like-minded users that can be delineated across ideological, political, or other personal attributes (Bäck et al., 2018; Bruns and Burgess, 2011; Gruzd et al., 2011; Stephansen and Couldry, 2014). Bäck et al. (2018), studying xenophobic online forums, argue that participation in such online communities can contribute to a shift from an individual identification to a group one over time. Gruzd et al. (2011) discuss how social media make multiple community memberships possible and help people to form new connections while maintaining their existing ones. They argue that Twitter can form the basis of interlinked personal communities. Bruns and Burgess (2011) argue that political discussions on Twitter lead to the formation of different communities, sometimes because of preplanning and sometimes as an ad hoc result of discussion and deliberation. Discussing community formation on Twitter, Park et al. (2016) argue that communities on Twitter reflect ideological or geographical communities. Therefore, studying community formation on Persian Twitter can also be key to understanding how users interact in that space and how that relates to different political orientations.
Twitter in Iran
Twitter received considerable media attention in connection with the 2009 protests, despite a low user base and low use within Iran (Christensen, 2011; Wojcieszak and Smith, 2014). As Christensen writes, “[t]he protests were labeled by some as a Twitter Revolution, despite the fact that there were just over 19,000 Twitter users in Iran out of a total population of just under 80 million” (2011: 238). However, the vast majority of Twitter's users tweeting about the protests were not based in Iran (Howard, 2010). Twitter was blocked preceding the 2009 elections in Iran, and, aside from a brief reprieve in 2013, has remained blocked since. Later, a study of young Iranians' use of social media confirmed that in 2009, Twitter played a minimal role in the political engagement of that demographic, which was the most technologically savvy population in Iran (Wojcieszak and Smith, 2014). However, Twitter did play an important role as an international platform for disseminating news about politics in Iran through popular hashtags such as #IranElection (Howard, 2010; Mottahedeh, 2015).
While some scholars have examined the use of Twitter in Iran (Christensen, 2011; Faris and Rahimi, 2015; Howard, 2010; Wojcieszak and Smith, 2014), there have been few studies using computational methods investigating the digital trace of users' interactions on Twitter. Shortly after the disputed 2009 election in Iran, the Web Ecology Project (2009) published a report of users' interactions on Twitter over the course of 18 days around the election (7–26 June 2009). The report investigated approximately two million tweets and provided a statistical overview of the population and activities during that period, and identified influential users, taking into account both numbers of original tweets and numbers of followers. However, the report did not go beyond a descriptive analysis of the data set. In another study, Khonsari et al. (2010) focused on the 2009 Iranian presidential elections, and through an analysis of tweets posted by opposition groups, revealed two main connected clusters in the network: people who opposed the government (70% of the nodes) and people who supported it (24% of the nodes). However, the authors did not provide details on how they identified political affiliations and whether the users they studied were Persian-speaking users only. Another study of Iran's 2009 election (Zhou et al., 2010) investigated the dynamics of information propagation during that period by analyzing three million tweets related to the Iranian election posted by approximately 500 thousand users during June and July of 2009. The authors initiated their data collection by studying the hundred most influential users provided by the Web Ecology Project (2009) and investigated a follower–followee network of 20 million users to identify tweets related to the Iran election. The paper investigated the structure and mechanisms of information propagation on Twitter and illustrated that the information cascades are shallow and that the depth of 99% of the cascades is less than 3. 
The authors also claimed that trending and the search bar play important roles in defining the retweet behavior of users beyond the follower–followee networks. While these studies provide some information about political engagement on Persian Twitter, there are still gaps in our understanding of how the political landscape is divided among different political orientations and of the structure of information diffusion processes in this landscape. The present study aims, first, to improve our understanding of the political landscape of Persian Twitter. It is also an attempt to fill the existing gap in the use of computational methods for studying the Iranian Twittersphere. We aim to use big social data collected from user interactions on Twitter to present a more accurate mapping of the political landscape of Persian Twitter (similar to Kelly and Etling's (2008) study of Blogestan). The 2013 presidential election is a particularly significant context for studying Twitter use in Iran, since it followed a strict ban on Twitter use after the 2009 election.
Data collection and methods
Data collection
We used the Personal Zombie application (Black et al., 2012), which uses the REST API search feature to collect statuses containing hashtags and keywords occurring within the last seven days. Personal Zombie can collect all tweets in the search index going backward in time to the point of the application's last run by matching the last-retrieved tweet ID. This strategy worked well with the size of retrieved data and did not face the challenges posed by Twitter's rate limit. It also enabled us to retro-collect data when we noticed a new trend or hashtag.
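The backward-paging strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Personal Zombie implementation: `search_page` is a hypothetical stand-in for a search call that returns statuses newest first and accepts `since_id` and `max_id` bounds, and the in-memory `INDEX` stands in for the seven-day search index.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory search index: one status per ID, IDs grow over time.
INDEX: List[Dict] = [{"id": i, "text": f"status {i} #IranElection"} for i in range(1, 26)]


def search_page(query: str, since_id: int = 0, max_id: Optional[int] = None,
                count: int = 10) -> List[Dict]:
    """Stand-in for a search call: matching statuses, newest first, at most `count`."""
    hits = [s for s in INDEX
            if query in s["text"] and s["id"] > since_id
            and (max_id is None or s["id"] <= max_id)]
    return sorted(hits, key=lambda s: s["id"], reverse=True)[:count]


def collect_since(query: str, last_seen_id: int) -> List[Dict]:
    """Page backward in time until reaching the last status seen by the previous run."""
    collected: List[Dict] = []
    max_id: Optional[int] = None
    while True:
        page = search_page(query, since_id=last_seen_id, max_id=max_id)
        if not page:
            break
        collected.extend(page)
        max_id = page[-1]["id"] - 1  # continue just below the oldest status retrieved
    return collected


# First run retrieves everything in the index; the second run only what is new.
first_run = collect_since("#IranElection", last_seen_id=0)
newest_id = max(s["id"] for s in first_run)
INDEX.append({"id": 26, "text": "status 26 #IranElection"})
second_run = collect_since("#IranElection", last_seen_id=newest_id)
```

Storing only the newest retrieved ID per keyword is enough to resume collection, which is also what makes it possible to add a keyword late and still backtrack through whatever the index retains.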
Many Iranian users use English hashtags for their Persian tweets (e.g., the popular #IranElection); therefore, we included both Persian and English keywords in our effort to collect a more inclusive data set. We began data collection with a set of initial keywords, including the names of presidential candidates in Persian and English (with multiple possible spellings in English). Every day during the data collection period, we monitored the data set to identify emerging keywords and hashtags and added them to the data collection, backtracking them using the affordances of the REST API. At the end of the data collection, we had a total of 47 keywords and hashtags (see Appendix 1).
Because our goal was to map the political landscape of Twitter around Iran's presidential election, we decided to collect data both before and after the election, in order to capture interactions of users in both periods. We collected data between 13 May 2013 and 29 June 2013 (four weeks before and two weeks after the election). At the end of our data collection period, we merged all collected data sets for each keyword and dismissed duplicate tweets that were collected multiple times for different keywords. The data collection yielded 3,006,528 tweets about Iran and the election, including 460,008 tweets in Persian.
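The merging step can be illustrated with a short sketch. It assumes each per-keyword collection is a list of dictionaries carrying a unique `id` field; the field name and the toy records are illustrative, not taken from the study's data.

```python
from typing import Dict, Iterable, List


def merge_collections(collections: Iterable[List[Dict]]) -> List[Dict]:
    """Merge per-keyword collections, keeping the first copy of each tweet ID."""
    seen = set()
    merged: List[Dict] = []
    for tweets in collections:
        for tweet in tweets:
            if tweet["id"] not in seen:
                seen.add(tweet["id"])
                merged.append(tweet)
    return merged


# Toy data: tweet 2 matches two keywords and is therefore collected twice.
by_hashtag = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}]
by_name = [{"id": 2, "text": "b"}, {"id": 3, "text": "c"}]
merged = merge_collections([by_hashtag, by_name])
```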
Data analysis
After data preparation, which included removing duplicates and normalizing Persian text, 2 we processed and analyzed our data sets in two tracks. In the first track, we used SNA to understand the structure of communication networks on Twitter during the presidential election as well as to analyze the mechanisms of information diffusion (Figure 1). In the second track, we applied Natural Language Processing (NLP) for textual analysis to better understand the nature of discourse around the presidential election and the political landscape of the Persian Twittersphere (Figure 2).
Figure 1. First track of data analysis, focused on social network analysis.
Figure 2. Second track of data analysis, focused on textual analysis.

We focused our SNA on the retweet network because retweets can be used to trace information propagation among users (Arnaboldi et al., 2014; boyd et al., 2010; Pezzoni et al., 2013). We analyzed the retweet networks of Persian and English tweets in two separate data sets to identify differences between the information diffusion practices and the structures of these two networks.
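As a rough illustration of how a retweet network can be derived from collected tweets, the sketch below builds weighted directed edges from retweeter to original author. The field names `user` and `retweeted_user` are assumptions made for this example and do not reflect the study's actual data schema.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (retweeter, retweeted author)


def build_retweet_network(tweets: List[Dict[str, Optional[str]]]) -> Counter:
    """Count how often each user retweeted each author, ignoring self-retweets."""
    edges: Counter = Counter()
    for tweet in tweets:
        author = tweet["retweeted_user"]
        if author is not None and author != tweet["user"]:
            edges[(tweet["user"], author)] += 1
    return edges


def times_retweeted(edges: Counter) -> Counter:
    """Total retweets received per author (weighted indegree)."""
    totals: Counter = Counter()
    for (_, author), weight in edges.items():
        totals[author] += weight
    return totals


tweets = [
    {"user": "a", "retweeted_user": "c"},
    {"user": "b", "retweeted_user": "c"},
    {"user": "a", "retweeted_user": "c"},
    {"user": "c", "retweeted_user": "d"},
    {"user": "d", "retweeted_user": None},  # an original tweet adds no edge
]
edges = build_retweet_network(tweets)
received = times_retweeted(edges)
```

Building one such network per language subset keeps the Persian and English diffusion structures separate for comparison.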
Categories used to code the most influential users.
We sought to determine which of these categories was the most influential during the election by investigating which users held central positions in the retweet network. In order to understand how communities form around the diffusion of information on Twitter, we also conducted a community detection analysis on the retweet network. To this end, we used the “fast unfolding of communities in large networks” algorithm, also known as the Louvain method, for community detection (Blondel et al., 2008). This algorithm has been identified as one of the most efficient methods for this purpose (Emmons et al., 2016; Yang et al., 2016). We then collected information about the most influential community members from their Twitter profiles and tweets to understand the composition of major communities and their political orientations. To better understand how different communities are formed in relation to each other in the Persian retweet network, we used the force-based algorithm ForceAtlas2 (Jacomy et al., 2014) to visualize the result of community detection analysis (Figure 5).
Figure 5. Force-based visualization of the retweet network reveals the isolated MEK community.
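The Louvain method searches for a partition that maximizes modularity. The sketch below does not implement the full algorithm; it only computes the modularity score Q for a given partition, which is the quantity the method greedily improves. It assumes, for simplicity, that retweet ties are treated as undirected and unweighted, and the toy graph is invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def modularity(edges: List[Tuple[int, int]], community: Dict[int, str]) -> float:
    """Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two triangles joined by a single bridging edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
single = {node: "A" for node in range(6)}
q_split = modularity(edges, split)
q_single = modularity(edges, single)
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the kind of comparison the Louvain method makes repeatedly as it moves nodes between communities and then aggregates them.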
In the second track of analysis (Figure 2), we used machine learning and NLP methods for language detection, sentiment analysis, topic detection, and sarcasm detection. The data processing workflow starts with a language identifier to detect the language of each tweet. Language detection for short texts can be problematic and inaccurate (Bergsma et al., 2012). To address this problem, we trained the Language-Aware String Extractor package 3 (Brown, 2012) to recognize Arabic, English, French, German, Persian, Spanish, Turkish, and Urdu. We achieved 98.10% accuracy on our test data set. Next, we deployed supervised learning to train a classifier to identify the subjects of Persian tweets. Finally, we detected the presence or absence of candidate(s) in those tweets. If the subject of a tweet was politics and there was at least one candidate's name in the tweet, we used the classifier to detect sarcasm, sentiment, and political orientation.
Summary of data annotation tasks for training machine learning classifier.
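The gating logic of this workflow can be sketched as below. All classifiers are passed in as plain functions and are stubbed in the usage example; they are hypothetical placeholders, not the trained models used in the study. The candidate list is an illustrative subset in Latin script, whereas real matching would rely on Persian spellings.

```python
from typing import Callable, Dict, List, Optional

CANDIDATES = ["rouhani", "aref"]  # illustrative subset, Latin script for readability


def find_candidates(text: str) -> List[str]:
    """Return the candidate names that occur in the tweet text."""
    lowered = text.lower()
    return [name for name in CANDIDATES if name in lowered]


def process_tweet(text: str,
                  detect_language: Callable[[str], str],
                  classify_subject: Callable[[str], str],
                  classifiers: Dict[str, Callable[[str], str]]) -> Optional[Dict]:
    """Apply the downstream classifiers only to Persian political tweets naming a candidate."""
    if detect_language(text) != "fa":
        return None
    if classify_subject(text) != "politics":
        return None
    mentioned = find_candidates(text)
    if not mentioned:
        return None
    result: Dict = {"candidates": mentioned}
    for task, classify in classifiers.items():
        result[task] = classify(text)
    return result


# Usage with stub classifiers standing in for trained models.
stubs = {"sentiment": lambda t: "positive", "sarcasm": lambda t: "no"}
kept = process_tweet("Rouhani won the debate", lambda t: "fa", lambda t: "politics", stubs)
dropped = process_tweet("Rouhani won the debate", lambda t: "fa", lambda t: "sports", stubs)
```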
The political affiliations in Iranian politics can be confusing. There are two major political coalitions inside Iran, known as the Reformists and the Conservatives or Principlists. The dissident groups residing outside Iran are broadly known as the Opposition in Persian political nomenclature. We applied the same categories for classification of political orientation. The Reformist movement in Iran formed after the election of Mohammad Khatami as president in 1997. Reformists in Iran are mostly supported by younger and urban demographics. The Reformist movement supports a more open sociocultural atmosphere in Iran while reducing tension in international relations. Conservatives support the status quo and the strict rule of religion, and usually appeal to more religious demographics and populations with lower socioeconomic status.
Sentiment analysis and opinion mining have been the subject of extensive research (Martínez-Cámara et al., 2014; Tripathi and Naganna, 2014). Two main approaches have been used for sentiment analysis on Twitter: unsupervised (or lexicon-based) methods and supervised methods (using a manually coded training data set). In this study, we did not use a lexicon-based method, for two main reasons. First, we did not have access to a reliable dictionary for term sentiments in Persian. Second, and more important, the use of dictionaries for sentiment analysis has been criticized for missing nuances (Grimmer and Stewart, 2013). Furthermore, the presence of sarcasm and the limited size of the text in each tweet raised concerns that the lexicon-based approach would not yield adequate results for sentiment analysis on Twitter data. In order to conduct sentiment analysis, we used N-gram features (unigram and bigram) for the training of the Stanford Classifier on our manually coded training data. The training resulted in 84.20% overall accuracy of sentiment detection of Persian tweets according to our 10-fold cross validation. We then applied this classifier to detect the sentiment toward various candidates. To do this, we first identified tweets that included a candidate's name, and then we predicted the sentiment toward that named entity. In this way, we were able to gauge the overall sentiment toward a candidate and their political party. We used sentiment toward candidates as a means to map the political landscape of the Persian Twittersphere.
Summary of the performance of textual analysis tasks.
Precision and recall are not reported for the multiclass classification tasks.
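To make the supervised approach concrete, the sketch below extracts unigram and bigram features and trains a small multinomial Naive Bayes model, then tallies predicted sentiment per candidate. Naive Bayes is swapped in here as a simple stand-in and is not the Stanford Classifier used in the study. The English training sentences and the names `candidate_a` and `candidate_b` are invented placeholders, and the tweet-level label is attributed to every candidate named in the tweet, which is a simplification.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


def ngram_features(text: str) -> List[str]:
    """Unigram and bigram features from whitespace tokens."""
    tokens = text.lower().split()
    return tokens + [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over n-gram features."""

    def fit(self, docs: List[str], labels: List[str]) -> "NaiveBayes":
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(ngram_features(doc))
        self.vocab = {f for counts in self.feature_counts.values() for f in counts}
        return self

    def predict(self, doc: str) -> str:
        total_docs = sum(self.class_counts.values())
        best_label, best_score = "", -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for feature in ngram_features(doc):
                if feature in self.vocab:  # features unseen in training are skipped
                    score += math.log((counts[feature] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def sentiment_by_candidate(tweets: List[str], candidates: List[str],
                           model: NaiveBayes) -> Dict[str, Counter]:
    """Tally predicted labels for each candidate named in a tweet."""
    tally = {name: Counter() for name in candidates}
    for text in tweets:
        label = model.predict(text)
        tokens = text.lower().split()
        for name in candidates:
            if name in tokens:
                tally[name][label] += 1
    return tally


train_docs = ["gives hope to voters", "great speech tonight", "i support this plan",
              "terrible answer tonight", "empty promises again", "i do not support this plan"]
train_labels = ["positive"] * 3 + ["negative"] * 3
model = NaiveBayes().fit(train_docs, train_labels)
tweets = ["great speech by candidate_a", "terrible promises from candidate_b",
          "i support candidate_a"]
tally = sentiment_by_candidate(tweets, ["candidate_a", "candidate_b"], model)
```

Aggregating these per-candidate tallies by the candidates' political camps gives a coarse picture of which orientation a set of tweets favors.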
In the next section, we discuss the major findings of the two tracks of analysis.
Findings
Descriptive analysis
Distribution of languages in the data set.
Beyond languages, it is important to see who tweeted most. The distribution of user activities on Twitter is a long-tail distribution following a power law. A small portion of users accounted for the majority of tweets. Only 7% of users posted more than five tweets with election-related content during the six weeks of data collection, while 71% of users tweeted only once during this period. This shows that a small portion of users at the time actively contributed to content creation and dissemination, while the majority of users were mostly consumers of the content. This might be associated with security concerns related to participating in sensitive discussions online in Iran.
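Shares of this kind can be computed directly from the list of tweet authors. The sketch below uses invented toy data; only the threshold of five tweets is taken from the text.

```python
from collections import Counter
from typing import List, Tuple


def activity_shares(user_ids: List[str], threshold: int = 5) -> Tuple[float, float]:
    """Return the share of users who tweeted once and the share above `threshold` tweets."""
    per_user = Counter(user_ids)
    n_users = len(per_user)
    once = sum(1 for count in per_user.values() if count == 1)
    heavy = sum(1 for count in per_user.values() if count > threshold)
    return once / n_users, heavy / n_users


# Toy data: one heavy user, one moderate user, three one-time users.
authors = ["u1"] * 8 + ["u2"] * 2 + ["u3", "u4", "u5"]
share_once, share_heavy = activity_shares(authors)
```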
Figure 3 illustrates the temporal distribution of tweets per day in three groups of tweets (all tweets, Persian and English tweets, and only Persian tweets). The peaks marked on the graph show the major events during this six-week period, which began with the disqualification of Rafsanjani, a prominent politician and former president, followed by three presidential debates on television, and finally the election. The distance between the lines on the graph shows that only Persian-language users reacted to the debates, whereas all users reacted to the election results. While analyzing the temporal distribution of tweets, we identified an anomaly in the frequency of tweets per hour in Persian and English. Figure 4 illustrates the frequency of tweets per hour in Persian and English. While the number of tweets in English spiked on election day, the number of tweets in Persian shows a surprising plateau. This anomaly may be associated with proactive censorship practices (see ASL 19 report). 5
After the 2009 election, the Iranian government used throttling as a mechanism to reduce the efficiency of internet communications. The results of ASL 19 monitoring of internet traffic in Iran show an increase in the number of attempts per user required to use the circumvention tool Psiphon on election day. This is a significant finding, because it shows that investigating levels of social media use also reveals patterns of network disruption, and researchers should be aware of such incidents when interpreting their results.
Figure 3. Temporal distribution of tweets and their connection to events.
Figure 4. Temporal distribution of tweets per hour in the two data sets of Persian and English tweets illustrates a surprising plateau on election day in Persian tweets.

Network analysis
Most visible. First, we analyzed the retweet network of Persian tweets. Our results show that Iranian micro-celebrities had the highest presence among the top hundred most retweeted users in Persian (not listed here due to privacy). The second most retweeted group was journalists. However, there were only eight official news/media outlets among the hundred most retweeted accounts. These media sources were: BBC News in Persian (@bbcpersian); Kaleme, a non-official news source of the Green Movement in Iran (@kaleme); Deutsche Welle radio in Persian (@dw_persian); Manoto Persian TV from London (@ManotoNews); the official Persian channel of the US Department of State (@USAdarFarsi); Mardomak, a non-official news source of the opposition in Iran (@mardomak); and Radio Farda, Radio Free Europe in Persian (@RadioFarda_). It is important to note that none of the media sources from inside Iran were among the most retweeted accounts. This can be attributed to the Twitter ban in Iran and the possible legal consequences of using the social network in Iran at the time of the election in 2013. In English tweets, in contrast to the Persian tweets, the most retweeted users were from two institutional elite categories, journalists and news/media outlets; micro-celebrities were not as prevalent as in the Persian tweet network. Among the top hundred most retweeted accounts, there were 26 news/media accounts, only two of which (@bbcpersian and @MehrnewsCom) tweet in Persian.
Most influential. In the Persian retweet network, the only news/media outlet among the top hundred influential users was BBC News in Persian (@bbcpersian), which ranks fourth on this list. Most of the users among the top hundred influential users fell into the Iranian micro-celebrity category. In contrast, for the English retweet network, the most influential categories were journalists and news/media. Two of the presidential candidates, Hassan Rouhani (@HassanRouhani) and MohammadReza Aref (@MohamadRezaAref), had high rankings, in the fourth and fifth places.
Our findings demonstrate a structural difference between the Persian and English Twitterspheres. Discussion about Iran in Persian on Twitter is mostly dominated by micro-celebrities, whereas English Twitter is dominated by institutional elites (i.e., official media, news agencies, journalists, and politicians). Mascaro (2015) found that the political conversation on English Twitter during the 2012 US presidential election was dominated by institutional elites as well. This difference in power structure shows that Persian Twitter in 2013 was dominated by micro-celebrities. This may be attributable to the ban on Twitter in Iran, which resulted in an absence of journalists and news/media outlets. However, this situation has changed since 2013, because Iran's president, Hassan Rouhani, and minister of foreign affairs, Javad Zarif, actively used Twitter shortly after the 2013 election, which encouraged journalists and news/media outlets in Iran to also use Twitter.
Community analysis. Community detection on the larger data set, the English retweet network, mostly grouped users by nationality. The result of community detection on the Persian retweet network was more informative. The force-based visualization of communities, as marked in Figure 5, illustrates the presence of an isolated group in the network, the members of which are highly connected to each other but not connected to the rest of the network. Next, we studied the profiles of the users in this community. We identified this community as the MEK. 6 It is important to note that the users in this group have very high levels of activity (high indegree centrality), but they do not have high eigenvector scores. This means that while they are very active and visible through retweeting each other, they do not have much influence on the overall conversation and the diffusion of information in the network. This finding underlines that the structural position of users should be considered when interpreting the importance of users' messages. Bruns and Stieglitz (2013) discussed the two categories of the most active and most visible users in an analysis of Twitter networks, arguing that the most active users are not necessarily the most visible users. We want to extend this idea by arguing that even the most visible users are not necessarily the most influential ones. This means that how many times a message is passed in the network via retweets is not as important as who is retweeting that message and what the structural position of that user is in the network.
This finding also shows the affordances of force-based visualization algorithms in identifying notable isolated communities. By looking at such visualizations of retweet networks, we may identify communities that are not easily detected through other methods.
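The gap between visibility and influence described above can be illustrated with a toy network. The sketch below is not the study's data or code: the node names, the edge list, and the plain power-iteration routine are all assumptions made for illustration. It builds one large component around a heavily retweeted hub and one small isolated group whose members only retweet each other, and then compares indegree with eigenvector centrality.

```python
from collections import Counter, defaultdict


def in_degree(edges):
    """Count how many distinct retweet edges point at each user."""
    return Counter(dst for _, dst in edges)


def eigenvector_centrality(edges, iterations=200):
    """Power iteration on incoming edges, with a self-term (A + I shift)
    so that the iteration does not oscillate on bipartite-like graphs."""
    nodes = sorted({n for edge in edges for n in edge})
    incoming = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(src)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {n: score[n] + sum(score[u] for u in incoming[n]) for n in nodes}
        top = max(new.values())
        score = {n: value / top for n, value in new.items()}  # rescale to max = 1
    return score


# Hypothetical toy network: an edge (a, b) means "a retweeted b".
fans = [f"fan{i}" for i in range(30)]
isolated = [f"iso{i}" for i in range(5)]
edges = []
for fan in fans:
    edges.append((fan, "hub"))   # every fan retweets the hub
    edges.append(("hub", fan))   # the hub retweets each fan once
# The isolated group retweets only its own members.
edges += [(a, b) for a in isolated for b in isolated if a != b]

indeg = in_degree(edges)
eig = eigenvector_centrality(edges)
for name in ("hub", "fan0", "iso0"):
    print(name, indeg[name], round(eig[name], 4))
```

In this toy case, a member of the isolated group is retweeted by more users than an ordinary fan, yet its eigenvector score collapses toward zero because none of its ties reach the dominant component. Eigenvector centrality on disconnected graphs concentrates on the component with the largest leading eigenvalue, so the pattern depends on the isolated group being less densely connected than the main network is broad.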
Textual analysis
As discussed in the methods section, we achieved success in sentiment detection toward candidate names. Thus, we used the distribution of sentiments toward candidates to measure the distribution of political support for different political groups in Iran. Table 5 shows the share of detected sentiment expressions for each candidate. Most expressions were directed toward Rouhani (39.20%) and Jalili (14.47%), who were the preferred candidates of reformist and conservative groups, respectively. Figure 6 shows the distribution of positive and negative sentiments toward different candidates and the ratio of negative to positive sentiments for each candidate. As can be seen from these ratios, reformist candidates were far more popular, indicating that sentiment on Persian Twitter predominantly favored reformist candidates. Our findings show that the landscape of the Persian Twittersphere differed from that of the Persian Blogsphere, which was evenly divided between the reformist and conservative camps in 2008 according to Kelly and Etling (2008).
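The two quantities discussed above, each candidate's share of all sentiment expressions and the ratio of negative to positive expressions, are simple to compute once tweets have been labeled. The following is a minimal sketch under stated assumptions: the candidate labels and counts are invented placeholders rather than values from Table 5 or Figure 6, and only positive and negative labels are assumed to count toward the share.

```python
def sentiment_summary(counts):
    """counts maps a candidate label to a (positive, negative) tuple of tweet counts."""
    total = sum(pos + neg for pos, neg in counts.values())
    summary = {}
    for name, (pos, neg) in counts.items():
        summary[name] = {
            # Share of all detected sentiment expressions aimed at this candidate.
            "share": (pos + neg) / total,
            # Values below 1 mean more positive than negative expressions.
            "neg_to_pos": neg / pos if pos else float("inf"),
        }
    return summary


# Hypothetical placeholder counts, not results from the study.
example_counts = {"candidate_a": (300, 100), "candidate_b": (50, 150)}
summary = sentiment_summary(example_counts)
for name, stats in summary.items():
    print(name, round(stats["share"], 3), round(stats["neg_to_pos"], 3))
```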
Distribution of positive and negative sentiments toward different candidates.
The result of sentiment analysis for the tweets with mention of candidates' names. One common issue in digital Persian text processing is that an Arabic character is substituted for a visually similar Persian character. The Arabic character is incorrectly implemented in place of the Persian one by Microsoft Windows and Apple iOS; as a result, there are two characters with different Unicode values that can be used to write the same letter. They look similar on the surface but have different character values. We included both to make sure we collected all related tweets.
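The note above concerns look-alike characters with different Unicode values. A well-known instance of this kind of confusion is Arabic Yeh (U+064A) versus Farsi Yeh (U+06CC), and Arabic Kaf (U+0643) versus Keheh (U+06A9). Treating these as the characters in question is an assumption made here for illustration, as are the function names. The sketch shows two complementary strategies: normalizing text to one canonical form, and expanding a search keyword into every spelling variant so that a collection query matches all of them.

```python
from itertools import product

# Assumed look-alike pairs: Arabic code point -> Persian code point.
ARABIC_TO_PERSIAN = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}
PERSIAN_TO_ARABIC = {persian: arabic for arabic, persian in ARABIC_TO_PERSIAN.items()}
_TO_PERSIAN = str.maketrans(ARABIC_TO_PERSIAN)


def normalize_persian(text):
    """Map the Arabic look-alikes to their Persian counterparts."""
    return text.translate(_TO_PERSIAN)


def keyword_variants(keyword):
    """Return every spelling of keyword that differs only in look-alike characters."""
    canonical = normalize_persian(keyword)
    options = [
        (ch, PERSIAN_TO_ARABIC[ch]) if ch in PERSIAN_TO_ARABIC else (ch,)
        for ch in canonical
    ]
    return {"".join(chars) for chars in product(*options)}


# "Rouhani" written with the Persian Yeh at the end.
rouhani = "\u0631\u0648\u062D\u0627\u0646\u06CC"
variants = keyword_variants(rouhani)
print(len(variants))  # one ambiguous letter, so two spellings
```

Expanding keywords suits data collection, where the query must match whatever users typed; normalizing suits later analysis, where both spellings should count as the same token.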
One of the methodological contributions of this study is the use of sentiment analysis for mapping the political landscape of Twitter by training a classifier using N-gram features. This method proved to be more effective than detecting the political orientation of tweets or using lexicon-based sentiment analysis methods, which might be problematic (Grimmer and Stewart, 2013).
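To make the idea of a classifier trained on N-gram features concrete, the sketch below extracts word unigrams and bigrams and feeds them to a small multinomial naive Bayes model with add-one smoothing. Naive Bayes is swapped in here purely for illustration; this section does not name the classifier used in the study. The class name, the English placeholder tweets, and the CANDIDATE token are likewise hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return word n-grams of length 1..n_max as space-joined strings."""
    tokens = text.split()
    features = []
    for n in range(1, n_max + 1):
        features += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return features


class NgramNaiveBayes:
    """Multinomial naive Bayes over n-gram counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            features = ngrams(text)
            self.feature_counts[label].update(features)
            self.vocab.update(features)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for feature in ngrams(text):
                if feature in self.vocab:  # ignore n-grams never seen in training
                    score += math.log((counts[feature] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training examples; CANDIDATE stands in for a candidate's name.
train = [
    ("i support CANDIDATE", "positive"),
    ("CANDIDATE is the best choice", "positive"),
    ("CANDIDATE will not fix anything", "negative"),
    ("i do not support CANDIDATE", "negative"),
]
model = NgramNaiveBayes().fit([t for t, _ in train], [y for _, y in train])
print(model.predict("we support CANDIDATE"))
print(model.predict("they do not support CANDIDATE"))
```

Bigrams such as "not support" let the model separate negated statements from affirmative ones even though both contain the word "support", which is one reason N-gram features can outperform a word-level lexicon.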
Discussion
In the present study, we examined the political landscape of Twitter during Iran's presidential election in 2013. We also investigated the information diffusion processes on Twitter during this period, the most influential users, and major communities. First, we found that Persian Twitter had a different power structure from English Twitter during the period of study (our data set included both Persian and English tweets). Persian Twitter was predominately ruled by micro-celebrities—social media users who created a strong base of followers and employed a performance style to increase their popularity among readers, viewers, and those to whom they were linked online. Discussion about Iran on Twitter in English, on the other hand, was dominated by institutional elites such as politicians, journalists, and news/media outlets. This difference in the power structure may be the result of restrictions on Twitter use in Iran creating an environment mostly dominated by micro-celebrities. However, this structural difference may have altered since the election of Hassan Rouhani and the use of Twitter by some government officials, which has encouraged other institutional elites in Iran to use Twitter since 2013.
Second, our results show that the most visible users and messages are not necessarily the most influential ones. In other words, how many times a message is retweeted is not as important as who retweets the message and the structural position of that user in the network.
Our third finding is that different Persian social media platforms might have different political landscapes. The political landscape of Persian Twitter in 2013 was dominated by reformists. By contrast, Kelly and Etling (2008) found that the Persian Blogsphere was evenly divided between reformists and conservatives. This encourages us to study the ecology of platforms, since political groups might use different platforms to spread their messages based on the affordances of those platforms.
The methodological contributions of our results are twofold. First, our results regarding the application of NLP show that sentiment analysis toward name entities can be more accurate in mapping the political landscape of conversation on Twitter than detecting political orientation from the content of tweets. Second, we noticed that force-based visualization algorithms can help us identify isolated groups and that analyzing those user profiles can reveal very interesting patterns of communication.
In the future, we would like to evaluate whether the power structure of Persian Twitter has changed since 2013. Such a study can reveal whether and how the recent use of Twitter by Iranian politicians has affected the power structure of its communication networks. We also believe that conducting qualitative interviews with influential users and collecting data through online surveys of Twitter users can enrich our perspective on the dynamics of the Persian Twittersphere. To improve our computational tools, we are interested in expanding our training data to improve the effectiveness and accuracy of our NLP toolkit.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
