Abstract
Can citizens impact the broader discourse about an organization and its legitimacy? While social media have empowered citizens to publicly question firms through large volumes of online evaluations, the high heterogeneity of their evaluations dilutes their impact. Our empirical study applying a threshold vector autoregressive model (TVAR) analysis of 2.5 million tweets and 1,786 news media articles tests the condition by which the heterogeneity of online evaluations converges and influences the broader media discourse. Although social media evaluations do not initially influence media legitimacy, they become influential after reaching a tipping point of refracted attention, which is created by high volume and convergence of individual evaluations around few aggregative frames. Thus, social media storms may influence the broader discourse about an organization when this discourse converges and reaches a tipping point, rather than merely through the massive participation of citizens.
Increased interest in social media by business and society scholars (Barnett et al., 2020; Etter et al., 2019; Glozer et al., 2019; Wang et al., 2021) stems from the radical changes introduced by social media. It is now possible for less-influential individuals to have an impact through interactions with others via micro-blogs (Glozer et al., 2019; i.e., social media sites allowing users to exchange brief messages). These individuals may be able to contribute to the social construction of organizational legitimacy, or social acceptance of organizations (Suchman, 1995), as they may shape the public discourse (Etter et al., 2018; Suddaby et al., 2017).
Although it is acknowledged that individuals increasingly contribute to the public debate shaping organizational legitimacy (Castelló et al., 2013; Etter et al., 2018; Glozer et al., 2019; Whelan et al., 2013), scholars still debate whether social media evaluations have a meaningful impact on businesses. Some suggest that these individual evaluations matter when they influence news media legitimacy, which provides credibility and visibility (Coombs & Holladay, 2012; Etter et al., 2019; Pfeffer et al., 2014). Other studies counter that their influence on the broader context is limited by their heterogeneity (Barnett et al., 2020; Wang et al., 2021). Social media empower “heterogeneous constituents” (Wang et al., 2021, p. 5) who express their opinions based on diverging expectations, values, norms, and involvements around a variety of organizational issues (Colleoni et al., 2021; Etter et al., 2019). Although social media make it easier for individuals to broadcast their heterogeneous evaluations (Glozer et al., 2019), in aggregate, their influence fades as quickly as it emerges (Wang et al., 2021). This is due to the increasingly fragmented media landscape (Etter et al., 2019; Roulet & Clemente, 2018), which further fragments heterogeneous interactions about organizations (Wang et al., 2021). Barnett and colleagues (2020) conclude that high content heterogeneity hinders the creation of focused attention because no clear frame is identifiable in the large volume of diverging expressions. We join this debate and argue that it is important to look further at heterogeneous evaluations and how, over time, they reach a critical mass of volume and converge, thereby influencing news media legitimacy. We therefore ask: How do heterogeneous legitimacy evaluations in social media converge and influence news media legitimacy?
To answer this question, we build on extant research suggesting that convergence in social media happens progressively and discursively (Albu & Etter, 2016; Arvidsson & Caliandro, 2016; Bennett & Segerberg, 2012; Colleoni et al., 2021; Illia et al., 2021). When individuals share evaluations posted by other users, rather than simply diffusing them verbatim, they co-associate their own content with that of others (Arvidsson & Caliandro, 2016; Bennett & Segerberg, 2012; Illia et al., 2021). Thereby, they contribute to forming a hypertext (i.e., a text composed of blocks of texts and electronic links; Albu & Etter, 2016; Landow, 2006). This could increase heterogeneity because new evaluations co-associated in a hypertext are potentially infinite (Glozer et al., 2019) as many new individuals join the discussion. However, we contend that heterogeneous evaluations may converge when there is high refracted attention (i.e., when a high volume of individual evaluations discursively twist around the same few aggregative frames; Bennett & Segerberg, 2012; Colleoni et al., 2021). We hypothesize that when a tipping point (Gladwell, 2000; Goel et al., 2016; Kitching & Purcell, 2017; Van Nes et al., 2016) of refracted attention is reached, one clear frame emerges, increasing the likelihood that news media will cover citizens’ evaluations with similar legitimacy.
Empirically, we conduct a study about the Banca Monte dei Paschi di Siena (MPS), Italy’s third largest bank, and analyze the legitimacy expressed before, during, and after it was accused of wrongdoing by Twitter users and traditional news media. MPS constitutes an interesting case because Italians expressed strong disapproval about this bank in social media when it was accused by the public attorney of having used derivatives to counter losses from the acquisition of the bank Antonveneta, one of its main rivals. Italians were outraged as they considered MPS responsible for worsening the economic situation in Italy in the years immediately after the subprime crisis. Furthermore, MPS was rescued with public tax money, at a time when many bankrupted family businesses in Italy were not.
Our article contributes to extant research in several ways. First, it expands business and society research that studies the role of individuals in shaping organizational legitimacy at societal level (Barnett et al., 2020; Etter et al., 2018; Glozer et al., 2019). Specifically, it contributes to the debate on whether social media evaluations are too heterogeneous to be impactful within a broader context (Barnett et al., 2020; Blevins & Ragozzino, 2019; Ravasi et al., 2019; Wang et al., 2021). It suggests that even if initially social media evaluations do not influence media legitimacy, they gain influential value over time after reaching a tipping point of refracted attention.
Second, our study contributes to studies conceptualizing legitimacy as a discursive construction in the public sphere (Etter et al., 2018; Glozer et al., 2019), as these studies discuss how social approval, and specifically legitimacy, is generated across many individuals. Our findings suggest that there is value in zooming out (i.e., in enlarging our interpretative lenses) and studying legitimacy discourses at the networked level, rather than only studying legitimacy dialogues at the dyadic level. Only in this way may we be able to understand how individuals co-situate themselves with regard to others’ evaluations within a larger hypertexted discourse.
Third, methodologically, our study captures the longitudinal discursive evolution of legitimacy evaluations by combining new digital methods such as semantic network analysis (Bonini et al., 2016; De Nooy et al., 2005; Illia et al., 2016, 2021) with threshold vector autoregressive model (TVAR) analysis, the latter being widely used in economics to capture business cycles (Beaudry & Koop, 1993; Osińska et al., 2020; Pesaran & Potter, 1997), interest rates (Anderson, 1997; Nyberg, 2018; Pfann et al., 1996), stock returns (Chen & Yang, 2019; Domian & Louton, 1997), prices (Aslan et al., 2018; Yadav et al., 1994), and exchange rates (Balke & Wohar, 1998; Taylor, 2001). Thereby, our study contributes to studies on organizational legitimacy in social media (Etter et al., 2018) and more broadly to studies on social approval of organizations (Bundy & Pfarrer, 2015; Wang et al., 2021) by suggesting means of combining different methods to measure the point at which a critical mass of evaluations in social media will or will not influence organizational news media legitimacy.
The article is structured as follows. First, we review the debate on the influence of social media in business and society literature. We review previous studies on the role of attention in social media and define refracted attention, followed by the development of our hypothesis. Second, we present our data, measures, and analysis. Finally, we present the results and discuss the theoretical and methodological contributions of our findings.
Organizational Legitimacy and the Tipping Point of Refracted Attention in Social Media
Social Media Influence on Organizational Legitimacy: A Contested Issue
As individuals increasingly express opinions about organizational legitimacy in social media, scholars have become interested in the broader influence of emerging discourses about organizations in social media.
Some scholars suggest that the impact of individuals’ legitimacy evaluations is limited as it is difficult, if not impossible, to identify a clear frame emerging from the heterogeneity of evaluations (Barnett et al., 2020). While individual evaluations on social media spread quickly, often driven by strong emotions, “the sheer volume of information available to constituents today precludes their ability to process and respond to all the negative news” (Wang et al., 2021, p. 10). Also, even if microdialogues “may potentially provide textual level frames through which people make sense of particular legitimacy struggles” (Glozer et al., 2019, p. 627), it has been argued that its heterogeneity does not allow the identification of one clear frame (Barnett et al., 2020). In consequence, individuals, news media, and organizations cannot assimilate heterogeneous stimuli coming from an increasingly fragmented media landscape (Roulet & Clemente, 2018) and therefore do not address, contest, or endorse citizens’ evaluations and concerns about organizations (Barnett et al., 2020; Wang et al., 2021).
Other scholars observe that social media users can become impactful and contribute to organizational policy change (Lazo, 2017), replacement of employees (Veil et al., 2011), destabilization of organizations (Toubiana & Zietsma, 2017), and even industry-wide change (Barberá-Tomás et al., 2019). This influence seems to be conditioned by intrinsic a-priori characteristics of social media content, such as their extreme emotionality (Toubiana & Zietsma, 2017) or visual characteristics (Barberá-Tomás et al., 2019), and by rather exogenous ex-post characteristics related to the discursive context of the evaluation (Cha et al., 2010; Illia et al., 2021; Illia et al., 2022; Zappavigna, 2011). In a post-scandal phase, the legitimacy expressed by news media is influenced by those social media evaluations that bridge many diverse online communities, as this bridging indicates that an organizational event still attracts broad interest even though the scandal is over and the volume of tweets is low (Illia et al., 2022). During a scandal, organizational outcomes are influenced by evaluations that achieve a high volume in networked conversations (Illia et al., 2021). However, the influence of volume achieved discursively on news media legitimacy is yet to be explored.
Attention in Social Media
A condition for news media to pick up content from social media is that the content achieves a certain degree of attention first (i.e., a high volume of shared information; Neuman et al., 2014; Ragas et al., 2014; Vargo & Guo, 2016). When individuals create negative attention around an organizational issue on social media, it increases the probability that this same issue will be portrayed by the news media (Conway et al., 2014; Etter & Vestergaard, 2016; Illia, 2003; Meraz, 2011; Ragas & Kiousis, 2010). This is “reverse agenda setting” (Vargo et al., 2014), where social media democratize news creation to the point that individuals indicate to the news media how and what to focus their coverage on (Ragas & Tran, 2013). This phenomenon can be explained by journalists’ increased use of social media as news sources (Paulussen & Harder, 2014), monitoring the attention that issues attract on social media (Hermida, 2012; Meraz, 2011; Paulussen & Harder, 2014; Ragas & Kiousis, 2010; Ragas & Tran, 2013; Sayre et al., 2010; Vargo & Guo, 2016). Large-scale studies including hundreds of thousands of tweets (Neuman et al., 2014; Ragas et al., 2014) suggest that journalists’ news reporting is indeed influenced by the attention users devote to certain topics.
Refracted Attention in Social Media
Most studies on social media refer to attention as volume based on verbatim repetition of messages (Conway et al., 2014; Meraz, 2011; Neuman et al., 2014; Ragas et al., 2014). Yet, the creation of focused attention is limited when no clear frame emerges from the large volume of heterogeneous opinions (Barnett et al., 2020; Wang et al., 2021). Recent studies have nonetheless shown that in social media, one visible frame can emerge (Arvidsson & Caliandro, 2016; Colleoni et al., 2021) when individuals express their heterogeneous evaluations while at the same time relating to others (Arvidsson & Caliandro, 2016; Bennett & Segerberg, 2012; Colleoni et al., 2021; Rieder, 2012). This implies that individuals do not simply repeat content published by others verbatim but co-associate (Bennett & Segerberg, 2012; Etter & Albu, 2021; Illia et al., 2021, 2022) their own evaluations with those of others. This generates a hypertext (Albu & Etter, 2016; Jackson, 2007; Landow, 2006) composed of blocks of texts and the electronic links that connect them. This hypertext is characterized by fluidity (Jackson, 2007) and co-authorship (Landow, 2006), as each author intentionally adds fragments of new text while interacting with others (Bennett & Segerberg, 2011, 2012). When this happens, a refraction occurs (i.e., “a twist, in the meaning of content happens discursively”; Colleoni et al., 2021, p. 4), as individuals re-appropriate content of others (Arvidsson & Caliandro, 2016), sharing previously unknown facts to self-publicize their own experiences (Bennett & Segerberg, 2011; Colleoni et al., 2012, 2021; Poell et al., 2016) rather than to inform others objectively. Like a wave changing direction when it passes through a new surface (Rieder, 2012), an evaluation in social media twists when, figuratively speaking, it passes from one block of text to another within the hypertext.
Given that each new fragment is added to a hypertext without a specific a-priori agreed plan among authors (Albu & Etter, 2016), a refraction of content in a hypertext may increase heterogeneous viewpoints (Barros, 2014). Even though a plurality of viewpoints may be potentially infinite (Glozer et al., 2019) and refraction may in principle create more dispersion (Poell et al., 2016; Rieder & Smyrnaios, 2012; Smyrnaios & Rieder, 2013), studies suggest that the added fragments in the hypertext are associated progressively with fewer selected aggregative frames (Arvidsson & Caliandro, 2016; Colleoni et al., 2021; Poell et al., 2016; Rieder, 2012) such that the hypertext evolves over time into a “coherent whole” (Jackson, 2007, p. 409). Through this process, refracted attention is created, that is, a high volume of individual evaluations twist progressively around the same fewer frames that function as discursive aggregators (Bennett & Segerberg, 2012; Colleoni et al., 2021). At this point, the multitude of heterogeneous information published by new users converges, as it is co-situated (Bennett & Segerberg, 2011) in the hypertext around one aggregative frame that becomes increasingly visible, the more co-associations are added.
For example, a recent study (Colleoni et al., 2021) showed that in 2011, during the subprime crisis, citizens who spread negative evaluations about banks retweeted evaluations of other citizens containing various hashtags, which function as aggregative frames in Twitter (Zappavigna, 2011, 2018, 2019). Adding their own negative evaluations about banks related to #netstrike #arrestbankers #nomoretayes #jobs #noviolence #nosurveillance created a negative and rather heterogeneous discourse in social media that evolved into a tweetstorm (Goodman, 2014) against banks, as new evaluations were added. Eventually, all new tweets converged and connected individuals from diverse groups with heterogeneous motivations, as new fragments of evaluations were paired with fewer hashtags such as #weare, #occupy, and #weare99. This limited number of hashtags acted as aggregative frames that allowed refraction and convergence at the same time. As more users joined the conversation on banks, by pairing their new evaluations around these aggregative frames, consisting of fewer hashtags, the users’ voices converged, but there was still heterogeneity contained within the hypertext.
The Tipping Point That Provides Impactful Convergence
We argue that refracted attention is crucial for individuals to participate in a discourse of legitimation, as it helps them to reassess their evaluations in the “use stage” (Tost, 2011, p. 697) of the legitimacy process. In this stage, individuals receive information that may be crucial for them to assimilate content received from others and decide whether they legitimatize an organization or not. The rise of fewer aggregative frames allows individuals to co-orientate to others’ evaluations within a discursive process of legitimacy construction (Glozer et al., 2019), forming evaluations in interaction with others (Schnider et al., 2018). This happens only when a certain tipping point (Gladwell, 2000; Goel et al., 2016; Kitching & Purcell, 2017; Van Nes et al., 2016) of refracted attention has been created, that is, after a certain critical mass of user evaluations has been co-paired with fewer aggregative frames. When citizens share others’ evaluations (Benkler, 2006; Kaplan & Haenlein, 2010), discursive legitimacy increases in volume and thus achieves attention in social media, as connections between diverse perspectives are potentially infinite (Castells, 2015; Dellarocas, 2003; Ellison & Boyd, 2013; Papacharissi, 2009). As citizens add their heterogeneous evaluations in a relatively dispersive way, attention is achieved but without convergence nor the salient emergence of a given frame. Eventually, citizens start to converge in the way they co-pair their evaluations with one another, as fewer aggregative frames are used as reference points of co-pairing their own evaluation. As new nuances of discursive legitimacy are added to fewer aggregative frames (Cha et al., 2010) convergence starts to emerge. 
This convergence is initially rather weak because refracted attention is not yet incisive. However, when a considerable volume of heterogeneous evaluations is co-paired with increasingly fewer aggregative frames, convergence strengthens because a critical mass of plural evaluations is added but is paired progressively with fewer aggregative frames. We argue that this tipping point allows legitimacy evaluations to become impactful because it creates “a change in the state” (Van Nes et al., 2016, p. 902) of influence between legitimacy evaluations expressed in social media and news media legitimacy, allowing news media to identify convergence between heterogeneous evaluations. Extant research considers that this discursive tipping point can be conceptualized as the point where an organizational issue changes status (Kitching & Purcell, 2017), as it refers to the moment at which the level of attention provided to an organizational issue creates awareness of it. We therefore hypothesize that when this threshold is reached, citizens’ evaluations are taken up by the news media, which are eventually influenced by their expressed legitimacy.
Methods
To investigate if and how legitimacy evaluations influence news media legitimacy by reaching a tipping point of refracted attention, we followed a purposive sampling procedure (Bryman & Cramer, 1994) and analyzed legitimacy evaluations expressed in social media and news media about MPS. Our research design involved the collection and analysis of large data in the form of unstructured text (Mayer-Schönberger et al., 2014). This kind of data provides researchers with new insights and new ways of seeing social reality (Mayer-Schönberger et al., 2014; McAfee & Brynjolfsson, 2012). Our approach suited the study of a high volume of data from social media interactions stemming from thousands of Twitter users that often lack clear structure (e.g., unknown source).
Data sets
Our data set is part of a broader project that analyses sentiment expressed in Twitter about multiple banks in Italy from early 2011 to early 2015, the years after the Occupy Wall Street movement emerged worldwide. For this article, we selected only the data on MPS from 2012 to 2014. The MPS scandal erupted in January 2013 when the public attorney’s investigation of MPS’s acquisition of Antonveneta became public. We nonetheless included the year before and the year after, to investigate the existence of a tipping point that captures the change in the state of legitimacy of MPS, and thus to examine the evolution of evaluations before, during, and after the scandal’s peak. Our data set is composed of 154 weeks of social and news media data, January 2012 to December 2014.
Social media data
Social media data about the MPS scandal were extracted using the Twitter API. The tweets were collected by defining a list of keywords containing the name Monte Paschi di Siena or its acronym: MPS, Monte Paschi, MP Siena, and Paschi Siena. The number of tweets totalled 2,580,000 and included 5,535 hashtags. 90.3% of the hashtags were circulated by citizens having few or no followers, with only 9.7% by actors with a medium to high follower basis and account descriptions that suggest their role as online reporters or influencers.
News media articles
To assess the legitimacy evaluations in news media over the same period, we collected news media articles through the Factiva database using the keyword “Monte Paschi di Siena.” In accordance with prior studies on media legitimacy, we included the three national newspapers with the highest circulation and authority levels (Deephouse & Carter, 2005): Corriere della Sera, La Repubblica, and La Stampa. We collected 1,786 articles about MPS over the period of investigation, averaging 11 articles per week.
Variable Operationalization
Measure for news media legitimacy (dependent variable)
We used a machine-learning semiautomated procedure to measure media legitimacy (Deephouse & Carter, 2005; Etter et al., 2018). We used the supervised learning technique Naive Bayes Classifier (Katakis et al., 2005) with the lexicon designed by Loughran and McDonald (2011) to build an index of sentiment ranging from -1 to +1 that categorizes words as positive or negative. To both binary values, we added a category “uncertain” to gather terms that were missing from the original sentiment dictionary. We chose this lexicon not only because it has already been validated, but also because it was developed to analyze tonality expressed with regards to financial and banking institutions. The average value per week was calculated by weighting the legitimacy evaluation on the number of articles published per week and then normalized using a z-score, so that the series had a zero mean and a standard deviation of one.
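The aggregation and normalization step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the article-level scores, the week labels, the reading of "weighting on the number of articles" as a per-week mean, and the use of the population standard deviation are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def weekly_mean_sentiment(articles):
    """Average per-article sentiment scores within each week.

    `articles` is an iterable of (week_label, score) pairs, with each
    score assumed to lie in [-1, +1].
    """
    by_week = defaultdict(list)
    for week, score in articles:
        by_week[week].append(score)
    # Assumption: "weighting on the number of articles" means a per-week mean.
    return {week: mean(scores) for week, scores in sorted(by_week.items())}


def z_scores(values):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - mu) / sigma for v in values]


# Hypothetical toy data: (week, article-level sentiment score).
articles = [
    ("2013-W03", -0.6), ("2013-W03", -0.2),
    ("2013-W04", -0.9), ("2013-W04", -0.7), ("2013-W04", -0.8),
    ("2013-W05", 0.1),
]
weekly = weekly_mean_sentiment(articles)
standardized = z_scores(list(weekly.values()))
```

The same `z_scores` helper would apply to any weekly series that is standardized to a zero mean and a standard deviation of one.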
Measure for citizens’ legitimacy evaluations (independent variable)
To identify the negative or positive evaluations about the bank in the collected tweets, we followed a recent study (Etter et al., 2018) that applied automated machine-learning techniques (Pang & Lee, 2008). Machine learning is the process of automatically discovering useful information in large data sets (Tan et al., 2014) and is applied to large-scale databases with unconventional data that need specific algorithms for analysis. In our case, a sentiment analysis algorithm was created with a machine-learning approach to capture legitimacy evaluation as an effective orientation toward organizations (Manning et al., 2008) which indicates the valence of evaluations.
Three researchers coded the same 1,459 tweets (10% of the dataset of re-tweets), obtaining a good level of intercoder reliability (Krippendorff’s alpha = .81, p = .026). Each tweet was given a unique sentiment value: negative (−1), neutral (0), or positive (+1). This coding was used to train the algorithm, which then coded the sentiment of the entire data set. Our algorithm employed a passive-aggressive model implemented with pairwise coupling and a majority voting method to account for multi-class categorization of words in our tweets (Hastie & Tibshirani, 1998). The quality of the feature extraction and classification model was confirmed by the experimental results obtained through a 10-fold cross-validation on the training dataset: f-measure 0.75 and accuracy 0.8. Following this computation, we calculated the sentiment per week and then normalized the measure using the z-score, so that the series had a zero mean and a standard deviation of one.
Measure for refracted attention (threshold variable)
We defined refracted attention as the frequency of co-occurrence of an evaluation that has been retweeted in a hypertext, as it is linked to new aggregative frames posted by the users who retweet it. For instance, if a user tweets about #debenedetti (one of the biggest debtors of MPS involved in the bankruptcy), and two others retweet this tweet and respectively add #crash and #criminal to #debenedetti, then the hashtag #debenedetti will weigh 2 as it co-appears twice, once with #crash and once with #criminal.
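The #debenedetti example can be reproduced with a few lines of code. The sketch below is only an illustration of the counting logic: the function name `co_appearance_weights` and the representation of each tweet as a set of hashtags are assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def co_appearance_weights(hashtag_sets):
    """Count, for each hashtag, how often it co-appears with another hashtag.

    `hashtag_sets` holds one set of hashtags per tweet or retweet.
    Each distinct pair inside a tweet counts once for both hashtags.
    """
    weights = Counter()
    for tags in hashtag_sets:
        for a, b in combinations(sorted(tags), 2):
            weights[a] += 1
            weights[b] += 1
    return weights


# The example from the text: one original tweet and two retweets,
# each of which adds one new hashtag.
tweets = [
    {"#debenedetti"},               # original tweet, no co-pairing
    {"#debenedetti", "#crash"},     # first retweet adds #crash
    {"#debenedetti", "#criminal"},  # second retweet adds #criminal
]
weights = co_appearance_weights(tweets)
print(weights["#debenedetti"])  # 2
```

Here #debenedetti weighs 2, whereas #crash and #criminal each weigh 1, because each of them co-appears only once.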
To measure the frequency of all paired evaluations, we developed a semantic network analysis of the 2,580,000 tweets following two steps: first, we created a two-mode network (De Nooy et al., 2005), in which both the tweets and the (added) hashtags are the nodes. A tweet was linked to a hashtag only if the tweet contains that hashtag in its body. We then transformed it into a one-mode network, in which hashtags served as the only nodes, and tweets constituted the link between hashtags (De Nooy et al., 2005). The weight of each link indicated the number of tweets in which two hashtags are used together (Bonini et al., 2016; Illia et al., 2021). This procedure was accomplished for the period we defined (i.e., the week), based on previous studies suggesting that a conversation in Twitter typically lasts a week (Bruns & Stieglitz, 2013; Kwak et al., 2010; Wu et al., 2011). We found 154 semantic networks in which the presence and weight of each link represents the number of tweets and the node represents the hashtags that express individuals’ evaluations during a specific week (e.g., hashtag in the tweet of user 1, hashtags in the retweet of user 2).
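The two-step construction described above (a two-mode tweet-hashtag network projected onto a one-mode hashtag network) can be sketched as follows. This is a simplified illustration, not the software used in the study: the tweet identifiers, the hashtags, and the function name `project_to_hashtag_network` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def project_to_hashtag_network(tweet_hashtags):
    """Project a tweet-hashtag (two-mode) network onto hashtags (one-mode).

    `tweet_hashtags` maps a tweet id to the set of hashtags in its body.
    Returns a Counter keyed by alphabetically sorted hashtag pairs; each
    value is the number of tweets in which the two hashtags appear together.
    """
    edge_weights = Counter()
    for tags in tweet_hashtags.values():
        for pair in combinations(sorted(tags), 2):
            edge_weights[pair] += 1
    return edge_weights


# Hypothetical tweets from a single week.
week = {
    "t1": {"#mps", "#crash"},
    "t2": {"#mps", "#crash", "#criminal"},
    "t3": {"#mps"},  # a single hashtag creates no link
}
edges = project_to_hashtag_network(week)
for (a, b), w in sorted(edges.items()):
    print(a, b, w)
```

Running the projection once per week would yield one weighted hashtag network per week, in the spirit of the weekly semantic networks described in the text.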
Once we had drawn these weekly semantic networks, which in our case represent our weekly hypertexts, we measured refracted attention as the number of co-appearances of a hashtag with other hashtags in a one-mode network. Values of our variable are as follows: 0 represents no refracted attention, as there is only one tweet with no co-pairing. Any value between 0 and 1 measures refracted attention. The closer the value is to 1, the higher the refracted attention, because one hashtag is co-paired with more other (added) hashtags. 1 is the highest possible value of our variable. In line with our theoretical argument, these new hashtags correspond to the new fragments added in the hypertext. This means that refracted attention captures the ability of a hashtag to be an aggregator of otherwise dispersed hashtags and to create volume discursively, rather than being just a measure of volume. Empirically, this distinction is confirmed by a correlation test, according to which refracted attention and volume have a low-to-moderate correlation of +0.17 (on a scale of −1 to 1) across the whole database. This means that these two variables capture different information. Refracted attention is intended to capture the discursive and conversational aspect of the discourse around the bank, not the simple volume of sharing. For example, if a discussion concerning the bank was made up of only one hashtag (e.g., #bankevil), meaning that there was only one angle of describing the bank, the refracted attention would be 0, as there is no co-appearance of hashtags and therefore no different cues of the organization. This would represent the emergence of one frame with no refraction and full adherence around a monolithic interpretation. Instead, if there was only one hashtag (e.g., #bankevil) used in combination with a high volume of other hashtags (e.g., #inequality, #killers, #violence, etc.) within many tweets, then refracted attention would be 1, as convergence around one frame allows inclusivity of heterogeneous frames that have become refracted by means of being co-paired with #bankevil.
In Figure 1, we provide a qualitative example from our data (one-mode network of 1 week in 2013), which shows the conversational aspect captured by the measure of refracted attention versus attention. For simplicity, we maintained only some node labels. Nodes represent the hashtags. The bigger the size of the node, the higher the volume (i.e., number of shares about that hashtag). The links represent a co-association between hashtags while sharing. More refraction is taking place when more links are associated with a node. The thickness of a link indicates that a specific refraction (i.e., co-association) is frequent. When a node has a high volume and many thick links, more refracted attention is occurring. For illustrative purposes, we explain refraction and volume (or lack thereof) of three hashtags: #monti, #PD, and #mussari. In that week, #monti has high volume, but no refraction, as there are no links. This means that many users share that frame that week, but none engage in a discourse around it. #PD has both high volume and high refraction as it is not only shared frequently but also associated with several other hashtags with thick links, which suggests that individuals, while re-tweeting the hashtag, refract its meaning. #mussari has low volume and moderate refraction, as it was apparently shared infrequently, but is associated with fewer links with hashtags than #PD, and less thick ones.

Figure 1. Examples illustrating volume with or without refraction.
Thus, if the hashtag complies with the above conditions, refracted attention is computed at the hashtag level as the frequency of hashtag usage within the hypertext. These frequency values were then averaged at the weekly level, hence coding the variable at the hypertext level. We then normalized the refracted attention variable using the z-score, so that the time series had a zero mean and a standard deviation of one.
Table 1 reports the descriptive statistics and correlation among the variables in our model and shows that both citizens’ evaluations and news media legitimacy are positively correlated with refracted attention and are only slightly correlated among themselves.
Table 1. Descriptive Statistics and Correlation Matrix.
p < .05. ***p < .001.
Analysis: TVAR Model
To test our hypothesis that the impact of citizens’ legitimacy evaluations on news media legitimacy differs depending on the level of refracted attention generated on social media, we ran a nonlinear threshold vector autoregressive (TVAR) model (Tsay, 1998). These models assume that the nonlinearity is governed by a threshold mechanism, which generates different relationships (i.e., regimes) between the observed variables depending on the values of a threshold variable. In our case, we hypothesize that when refracted attention is low, the system is in a regime where news media legitimacy is influenced by its own past values but not by citizens’ evaluations on social media. Conversely, above a certain level of refracted attention, we expect these evaluations to significantly influence news media legitimacy.
Switching between regimes is governed by a switching variable, so that any crossing above or below the threshold value in the switching variable, as identified by the model, will trigger a change of regime. Different regimes indicate that different linear models best describe the interdynamics among the same variables. Thus, the TVAR model allows us to test our hypothesis by showing that first, one variable (refracted attention) determines whether the entire system shifts from one regime to the next. Second, the same model applied to different regimes leads to different results. Together, these two findings show the importance of accounting for refracted attention to understand the nonlinearity of the relationship between citizens’ evaluations and news media legitimacy.
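For readers who prefer notation, a two-regime threshold VAR with one lag can be written as follows. This is a generic textbook form offered as a sketch: the symbols, the intercept terms, and the delay d of the threshold variable are assumptions introduced here and are not taken from the estimated model.

```latex
% y_t      : vector (news media legitimacy_t, citizens' evaluations_t)'
% q_{t-d}  : threshold variable (refracted attention), observed d periods earlier
% \gamma   : threshold value estimated from the data
% L, H     : low and high regimes, each with its own coefficients
y_t =
\begin{cases}
  c^{(L)} + A^{(L)} y_{t-1} + \varepsilon^{(L)}_t, & \text{if } q_{t-d} \le \gamma, \\[4pt]
  c^{(H)} + A^{(H)} y_{t-1} + \varepsilon^{(H)}_t, & \text{if } q_{t-d} > \gamma .
\end{cases}
```

The hypothesis corresponds to the coefficient in A that links lagged citizens' evaluations to news media legitimacy being indistinguishable from zero in the low regime and positive in the high regime.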
Our goal is to test two things. First, we test whether levels of refracted attention below the threshold value constitute a low attention regime, in which negative (or positive) evaluations expressed in social media do not influence negative (or positive) media legitimacy. Second, we test whether levels above the threshold value constitute a high attention regime, in which these influences are significant.
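The regime logic can be sketched in a few lines. The function below is a hypothetical helper, not part of any library: it assumes a known threshold and a delay of one period for the switching variable, and it only labels each week as belonging to the low or the high regime. The series and the threshold of 0.0 are invented.

```python
def split_regimes(threshold_var, threshold, delay=1):
    """Return (low, high): time indices assigned to each regime.

    Week t falls in the low regime when the threshold variable observed
    `delay` weeks earlier is at or below the threshold, else in the high regime.
    """
    low, high = [], []
    for t in range(delay, len(threshold_var)):
        if threshold_var[t - delay] <= threshold:
            low.append(t)
        else:
            high.append(t)
    return low, high


# Hypothetical z-scored refracted attention for five weeks (illustration only)
refracted = [-1.0, 0.5, -0.2, 1.2, 0.1]
low_weeks, high_weeks = split_regimes(refracted, threshold=0.0)
print(low_weeks, high_weeks)
```

A separate linear model is then fitted to the observations in each list, which is what allows the same variables to show different relationships across regimes.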
Model Estimation
To estimate our model, we followed five steps of analysis: in steps I to IV, we tested the assumptions for model building, whereas in step V, we estimated the actual model.
Test stationarity
We first ran the Dickey–Fuller/GLS (generalized least squares) test for stationarity of the three variables: news media legitimacy, citizens’ legitimacy evaluations, and refracted attention. An important assumption of the linear VAR model is that the series are stationary. The variables were all found to be stationary.
Define lags
Then, we selected the optimal lag order by estimating the linear VAR model. The optimal lag order is determined using the Akaike Information Criterion (AIC), which is the most widely used information criterion in the empirical literature of model selection (Hastie et al., 2009). The AIC selects 1 lag as the optimal lag for the model.
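As a reminder of what the criterion trades off, one common form of the AIC for a VAR with p lags is shown below. This is the standard textbook expression, given as a sketch; the exact variant computed by the software used in the study is not stated, so the symbols here are assumptions.

```latex
% K : number of variables in the VAR
% T : number of observations used in estimation
% \hat{\Sigma}_u(p) : estimated residual covariance matrix of the VAR(p)
\mathrm{AIC}(p) = \ln \left| \hat{\Sigma}_u(p) \right| + \frac{2\, p\, K^{2}}{T}
```

The first term falls as lags are added and the fit improves, while the second penalizes the additional K² coefficients per lag; the selected lag order is the p that minimizes the sum.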
Check exogeneity of threshold variable
Theory indicates that refracted attention (i.e., co-pairing hashtags) is a rather extrinsic and ex-post characteristic related to the discursive context where the evaluation is inserted rather than the evaluation itself. Hence it can be considered exogenous. However, before using the refracted attention as a threshold variable, we first controlled for its exogeneity. While the TVAR model allows for endogeneity of all its parameters, the threshold variable must be exogenous. We estimated the linear VAR model with the three variables and tested whether the citizens’ evaluations and the news media legitimacy variables were Granger-causing the refracted attention. We failed to reject the null (H0: News media Legitimacy_zscore and Citizens’ evaluation_zscore do not Granger-cause refracted attention_zscore), with F-Test = 1.0109, df1 = 6, df2 = 423, p value = 0.4175, hence, refracted attention is exogenous. We can therefore use refracted attention as a potential threshold variable in our model.
Confirm nonlinearity in our model
The next step was to test whether there was nonlinearity in our model. We assumed nonlinearity because high levels of refracted attention can lead to either a decrease in our dependent variable (in the case of high spreading of negative citizens’ legitimacy evaluations) or an increase (in the case of high spreading of positive citizens’ legitimacy evaluations). Therefore, refracted attention cannot be used as a normal regressor in a linear model but must be modeled as an enabler of the linear relationship between news media legitimacy and citizens’ legitimacy evaluations. When we have high levels of refracted attention, negative (positive) citizens’ legitimacy evaluations will impact negatively (positively) on news media legitimacy. From a statistical viewpoint, we are testing that, above certain levels of refracted attention, we observe a positive monotonic relationship between news media legitimacy and citizens’ legitimacy evaluations. To test the nonlinearity in our model, we applied the multivariate extension proposed by Lo and Zivot (2001) of the threshold linearity test of Hansen (1999). As in the univariate case, the first threshold parameter is estimated with a conditional least squares estimator. For the second threshold, a conditional search with one iteration is made. Instead of an F-test comparing the sum of squared residuals (SSR), as in the univariate case, a likelihood ratio (LR) test comparing the covariance matrix of each model is computed. The test checks the null hypothesis of linearity (m = 1 regime) against the alternative of nonlinearity (m = 2, 3 regimes). The calculation of the p value is done by simulation. The bootstrap distribution is based on resampling the residuals from the null model, estimating the threshold and computing the test. For all computations, we use 1,000 bootstrap replications.
Results of the test confirm that the relationship between our independent and dependent variables is nonlinear (H0: linear model to be preferred over 2-regimes nonlinear model, F-test = 61.18538, p value ≤ 0.000; H0: linear model to be preferred over 3-regimes nonlinear model, F-test = 93.97164, p value ≤ 0.000) and that the dynamics are best described by a 2-regime TVAR model with refracted attention as a threshold variable (H0: 2-regimes nonlinear model to be preferred over a 3-regimes nonlinear model, F-test = 26.5257993, p-value = 0.6666667).
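The idea behind conditional least squares estimation of a threshold can be sketched in a deliberately simplified, single-equation setting. The code below is an illustration under stated assumptions, not the multivariate Lo and Zivot (2001) procedure and without any bootstrap: it tries each observed value of the threshold variable as a candidate, fits one simple regression per regime, and keeps the candidate with the smallest total SSR. The function names, the 15% trimming, and the noise-free toy data are all hypothetical.

```python
def ols_ssr(x, y):
    """Sum of squared residuals of the least squares fit y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


def estimate_threshold(x, y, q, trim=0.15):
    """Grid search for the threshold on q that minimizes the two-regime SSR."""
    n = len(y)
    min_obs = max(int(trim * n), 2)  # keep enough observations in each regime
    best = None
    for gamma in sorted(set(q)):
        low = [i for i in range(n) if q[i] <= gamma]
        high = [i for i in range(n) if q[i] > gamma]
        if len(low) < min_obs or len(high) < min_obs:
            continue
        ssr = (ols_ssr([x[i] for i in low], [y[i] for i in low])
               + ols_ssr([x[i] for i in high], [y[i] for i in high]))
        if best is None or ssr < best[1]:
            best = (gamma, ssr)
    return best


# Noise-free toy data with a true break after q = 9 (illustration only)
q = list(range(20))
x = [(7 * i) % 5 for i in range(20)]
y = [0.5 * xi if qi <= 9 else 2.0 - 0.5 * xi for xi, qi in zip(x, q)]
gamma_hat, ssr_hat = estimate_threshold(x, y, q)
print(gamma_hat, ssr_hat)
```

In the linearity test, the fit of the best threshold model is then compared with the fit of the linear model, and the comparison is repeated on bootstrap samples to obtain the p value.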
Estimate TVAR model
Given that all assumptions are satisfied, we estimated the following: a 2-regime TVAR model with news media legitimacy as the dependent variable, citizens’ evaluation, and past value of news media legitimacy as regressors and refracted attention as the threshold variable. To quantify the impact of the variables in the model, impulse response function (IRF) simulations were carried out for low and high regimes. IRF provides two key indications about the model: first, it allows us to easily assess what could happen to y if x changes by z percent; second, it allows us to estimate how long the effect would last. In our analysis, we quantified the impact that a +1 and -1 standard deviation change in news media legitimacy past values and citizens’ legitimacy evaluations would have on news media legitimacy in the low and high regimes of refracted attention, respectively.
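The mechanics of an impulse response within a single regime can be sketched as follows. This is a simplified illustration, assuming a VAR with one lag, no switching between regimes during the simulation, and a non-orthogonalized one-off shock; the coefficient matrix is a placeholder and does not reproduce the estimates in Table 2.

```python
def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def impulse_response(A, shock, horizons):
    """Responses of y_t = A y_{t-1} + e_t to a one-off shock at horizon 0.

    Returns a list of vectors, one per horizon 0..horizons.
    """
    responses = [list(shock)]
    for _ in range(horizons):
        responses.append(matvec(A, responses[-1]))
    return responses


# Variable order: [news media legitimacy, citizens' evaluations].
# Placeholder coefficients for a hypothetical high regime (illustration only).
A_high = [[0.10, 0.16],
          [0.00, 0.30]]

irf_minus = impulse_response(A_high, shock=[0.0, -1.0], horizons=5)
irf_plus = impulse_response(A_high, shock=[0.0, 1.0], horizons=5)
for h, (news, citizens) in enumerate(irf_minus):
    print(h, round(news, 4), round(citizens, 4))
```

Because each regime is linear, the responses to the +1 and −1 shocks are mirror images of each other, and the horizon at which the response of news media legitimacy becomes negligible indicates how long the effect lasts.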
Results
Table 2 reports the results of the threshold autoregressive model with refracted attention as the threshold variable: these results provide support for H1. The model shows that the variables influencing news media legitimacy vary across regimes. In the low regime (when refracted attention is below the threshold value), news media legitimacy is positively and significantly influenced by its past values (news media legitimacy at lag 1, b = 0.3828; p < .001), but not by citizens’ legitimacy evaluations. In the high regime (when refracted attention is above the threshold value), news media legitimacy is influenced by citizens’ legitimacy evaluations (citizens’ legitimacy evaluations at lag 1, b = 0.1586; p < .05), which have a positive and significant impact on future values of news media legitimacy. Furthermore, since news media legitimacy significantly depends on its own past values in the low but not in the high regime, the findings suggest that editorial continuity of news media is interrupted once the threshold of refracted attention is exceeded, supporting our hypothesis that news media legitimacy is influenced primarily by social media once the threshold is surpassed.
Estimation Results of the Threshold Vector Autoregressive Model (TVAR) Model With 2 Regimes.
News media legitimacy = dependent variable; Citizen evaluation = independent variable; Refracted attention = threshold variable.
Signif. codes: *p < .05. ***p < .001.
Standard errors between brackets.
Best unique threshold: -0.2967161; Full sample size: 154; End sample size: 151.
Number of variables: 2; Number of estimated parameters: 8 + 1.
AIC 9.790288 BIC 85.22228 SSR 265.1002.
These results confirm a positive monotonic relationship between citizens’ legitimacy evaluations and news media legitimacy variables within regimes: the more citizens’ legitimacy evaluations are positive/negative, the more news media legitimacy is positive/negative. The threshold value for regime switching is +0.019 for the refracted attention variable.
We went back to our database at the hashtag level, rather than the weekly level, to characterize tweets that exceed this threshold. This allowed us to draw three conclusions that confirm the theory behind our hypothesis: First, there are few aggregative frames allowing citizens to co-situate their evaluation with one another, since only 7% of hashtags and tweets in our data set are above the threshold. Second, there are multiple heterogeneous evaluations linked to these few aggregative frames, since we observe that these 7% of tweets are co-associated with, on average, 1,087 other tweets. Third, the influence of citizens’ legitimacy evaluations in social media is driven by the general online public rather than fewer elite bloggers or influencers because refracted attention is pushed forward by users that have, on average, 5,889 followers (near the total sample average of 5,315). Tweets with refracted attention over the tipping point are those that drive the influence of social media legitimacy on news legitimacy, as they link heterogeneous evaluations into one frame, regardless of citizens’ a-priori popularity level.
To quantify the impact of an unexpected increase or decrease in citizens’ legitimacy evaluations and past news media legitimacy on news media legitimacy, we investigated the IRF in the low and high regimes, respectively. Figure 2 shows the impact that a +1 standard deviation increase in citizens’ legitimacy evaluations and past news media legitimacy, respectively, would have on news media legitimacy in the low regime (top left figure), and in the high regime (top right figure). It also shows the impact a −1 standard deviation decrease in citizens’ legitimacy evaluations and past news media legitimacy, respectively, would have on news media legitimacy in the low regime (bottom left figure), and in the high regime (bottom right figure). In the high regime of refracted attention, a shock in citizens’ legitimacy evaluations creates a substantial change in the index of news media legitimacy, which does not happen in the low regime of refracted attention. This means that after a certain threshold of refracted attention, the more an evaluation achieves refracted attention, the more news media legitimacy is influenced.
As the responses to the −1 and +1 standard deviation shocks are mirror images of each other, we focus our explanation of Figure 2 on the bottom figures, which refer to the −1 shock simulation. As predicted by our model, in the low regime (bottom left figure), the past values of news media legitimacy significantly impact news media legitimacy. The effect takes 5 weeks to wear off. A −1 standard deviation shock in news media legitimacy decreases news media legitimacy by 14%. In the high regime (bottom right figure), citizens’ legitimacy evaluations significantly and negatively impact news media legitimacy. The effect takes 3 weeks to wear off. A −1 standard deviation shock decreases news media legitimacy by 4%.

Impulse response function (IRF) in low and high regimes for ±1 standard deviation (SD) shock on news media legitimacy.
These findings are supported by two extra analyses (see Appendix), which show that the results are robust: they do not depend on the type of actors included in the sample or on the pure volume of sharing (i.e., attention without refraction) in the data set. They also confirm that the switch from one regime to the other is conditioned only by refracted attention and not by the simple volume of tweets.
Discussion
Legitimacy evaluations expressed by a single evaluator can rarely be considered representative of organizational legitimacy for the whole society. Hence, it is important to analyze how organizational legitimacy is constructed across different types of evaluators and discourses. Our study has focused on legitimacy evaluations and discourses by two types of evaluators and their interrelationship, namely news media and citizens on social media. Our findings show that legitimacy expressed in social media influences news media legitimacy after a tipping point of refracted attention is reached, that is, when large volumes of heterogeneous evaluations converge through a hypertext. These findings expand business and society research (Barnett et al., 2020; Etter et al., 2018; Glozer et al., 2019), by emphasizing the enhanced role of individuals and social media in shaping how organizations are evaluated in society. Also, they contribute to studies on social approval of organizations (Bundy & Pfarrer, 2015; Wang et al., 2021) because they discuss how social approval is generated over time.
The Influence of Social Media: Tipping Point of Refracted Attention
The literature suggests that the influence of citizens in social media is typically undermined by information overload in a fragmented digital media landscape (Barnett et al., 2020; Blevins & Ragozzino, 2019), where it is difficult to identify a clear frame on organizational issues across multiple and heterogeneous expressions. However, other studies indicate that influence occasionally takes place (Lazo, 2017; Toubiana & Zietsma, 2017; Veil et al., 2011). We hypothesized that the impact of legitimacy evaluations in social media is conditioned by a tipping point of refracted attention. Refracted attention differs from attention conventionally studied in the context of social media (Hermida, 2012; Ragas & Tran, 2013; Vargo & Guo, 2016; Wang et al., 2021). It does not refer to the mere volume of sharing content (e.g., retweets by millions of users), but rather to volume through convergence, that is, the volume of content that is created while individuals create a twist in others’ evaluations, progressively and discursively. This conceptual difference allows us to understand that the influence of legitimacy expressed in social media happens as a discursive accomplishment that is conditioned by the degree to which individuals’ heterogeneous evaluations progressively crystallize (Bennett & Segerberg, 2011) around fewer discursive aggregative frames (Bennett & Segerberg, 2012; Colleoni et al., 2021) that act for citizens “as a medium to a multitude of diverse situations of identities” (Arvidsson & Caliandro, 2016, p. 1). Through the refraction of others’ evaluations, they become co-situated in a hypertext that evolves into a coherent whole (Jackson, 2007).
These insights change our way of understanding, conceptualizing, and studying legitimacy evaluations expressed in social media, where the media landscape is increasingly fragmented (Etter et al., 2019; Roulet & Clemente, 2018). Extant research assumes that salience of social evaluations in social media depends on the participation of millions of individuals who rapidly generate noise by spreading disapproval or approval through mere sharing (Barnett et al., 2020; Etter et al., 2018; Wang et al., 2021). We suggest, instead, that their salience depends on exceeding a tipping point of refracted attention that generates volume through convergence. By integrating the principle of the tipping point in our theorizing and hypothesis testing, we suggest that scholars in this field should acknowledge that millions of evaluations may not be influential unless they reach a tipping point of convergence (i.e., refracted attention) that situates them in a larger hypertext that is coherent. If scholars start to think in terms of tipping points and levels of refracted attention, rather than attention generated by mere volume through accumulation, new avenues for studying individuals’ legitimacy evaluations in social media may emerge. Such work could further explore how legitimacy expressed by millions of actors in social media, although heterogeneous by nature, eventually converges and impacts legitimacy expressed by other actors such as institutional evaluators.
Zooming Out to Understand Discursive Legitimacy
The conceptualization of refracted attention proposed by this study suggests that individual evaluations are co-situated in a larger hypertext and thus contributes to our understanding of how legitimacy is discursively created after evaluations are published in social media. Previous studies analyzed microdiscursive strategies within dialogues in social media (Glozer et al., 2019), which deepens our understanding of the heterogeneous and fluid nature of legitimacy in social media. By focusing on how these microinteractions are co-situated in a large hypertext, we can extend our knowledge of the larger picture of “crowd dynamics” (Jepperson & Meyer, 2011, p. 62) that allow heterogeneous individual evaluations to form a larger legitimacy discourse at the aggregated level. Individual evaluations (Tost, 2011) are not formed in isolation but together with others (Schnider et al., 2018). To know more about how legitimacy is formed in interactions, we must zoom out (i.e., enlarge our interpretative lenses) in order to have a wider view of the legitimacy process. We need to go from studying dialogues at the microlevel (e.g., dyadic conversations) or cognitions at the individual level (Schnider et al., 2018) to exploring the characteristics of the hypertext space where these interactions occur (Barros, 2014; Zappavigna, 2011, 2018, 2019). Such zooming out allows us to understand that not all frames shared in microconversations are equally relevant in helping individuals co-situate legitimacy evaluations with one another (Tost, 2011) into one coherent discourse. Only those aggregative frames that are repetitively co-situated with a high volume of evaluations in a hypertext allow this. In our study, these aggregative frames are hashtags in tweets that have achieved a high level of refracted attention.
Methodological Contribution: Finding the Tipping Point With TVAR
Although we applied conventional analytical techniques such as TVAR models (Tsay, 1998) and semantic network analysis (Bonini et al., 2016; De Nooy et al., 2005; Illia et al., 2016, 2021, 2022), we believe that our study provides a combination of procedures and steps of analysis that have been underused in business and society research to date. First, our study demonstrates how to use semantic network analysis (via conversion of a one-mode to a two-mode network) to operationalize convergence in the legitimacy process and thus to grasp how legitimacy is discursively expressed. Second, our study shows how to use TVAR models to analyze how legitimacy judgments expressed by citizens may influence legitimacy of other important actors. We suggest using sentiment in social media as an independent variable, institutional actor legitimacy as a dependent variable, and a threshold variable that explains the significant or nonsignificant causality of the former on the latter. The use of TVAR models is advantageous from an empirical standpoint because, as an extension of VAR models, TVAR allows us to model endogeneity and nonlinearity while testing relationships among variables; both are common problems for social media data and for data expressing an evaluation (positive and negative). Furthermore, the use of TVAR is valuable from a theoretical viewpoint because it allows us to operationalize the tipping point empirically. This shows that not all tweetstorms have the same influence on institutional actors, only those that reach a certain threshold of aggregation. This suggests the importance of studying not only citizens’ legitimacy evaluations (i.e., the sentiment) but also the degree to which these evaluations become influential, not because there is massive participation but because there is an emerging discourse that aggregates heterogeneous evaluations.
Limitations
Although our analysis examines the case of an organization being subject to evaluations that may be expressed for other organizations in contested industries, it is based on a single case study and therefore requires replication to increase the transferability of our findings. The tipping point may differ in intensity across different types of organizations, industries, and cultural contexts, as well as in situations other than organizational scandals. It would be interesting to explore whether the threshold of refracted attention is similar in other cases.
Our study successfully tests the impact of legitimacy in social media over media legitimacy, but further research needs to explore the broader impact for legitimacy expressed by other macroinstitutional actors such as regulators or governments. We conducted our study with a focus on the media landscape that has been particularly affected by the rise of social media (Etter et al., 2019; Wang et al., 2021). We expect regulators or the legal system, for example, to be influenced in different ways by individual evaluations expressed in social media (Bitektine & Haack, 2015).
Future Research
Although refracted attention allows heterogeneous evaluations to emerge as one clear discourse, their convergence is not a monolithic echo-chamber (Colleoni et al., 2014) of evaluations that reflect and reinforce one’s own, but a refraction chamber (Michailidou, 2017; Rieder, 2012) of networked messages. These spread independently of the underlying infrastructure of networked actors, being motivated by self-publicity (Bennett & Segerberg, 2012; Colleoni et al., 2021); they also spread independently of a common ideology influenced by the establishment of a follower–followee relationship and selective exposure. It would be interesting to analyze the characteristics of such a refraction chamber, in which the imagery about an organization (Van Bommel & Spicer, 2011) emerges from refracted attention, i.e., a refracted voice. This would enable us to understand whether and how refracted attention is relatively immune to manipulation because it emerges discursively.
Furthermore, our study did not test how reaching a tipping point of refracted attention may also influence organizational strategies and values or even the norms of an industry (Barberá-Tomás et al., 2019), nor how organizations can pre-emptively identify the rise of this tipping point. This would allow organizations to better understand and respond to citizens’ demands, as suggested by Barnett and colleagues (2020). It would be interesting to explore longitudinally whether organizations start to adapt their strategies and values only once a tipping point is reached. This would entail studying cases of firms in crisis or stability, to identify different typologies of tipping points, as they may be related to positive as well as negative sentiment expressed online.
Finally, although this article does not aim to test empirically the measure of refracted attention versus the more traditional measure of attention, our findings invite researchers to further study their relationship over time. First, it would be interesting to study the temporal relationship between these two variables, since volume affects the news media sphere over time. Second, researchers could critically re-discuss volume as the measure commonly used to assess the influence of social media evaluations because the role of volume and number of individuals participating in Twitter storms seems less crucial than has been postulated.
Conclusion
Testing the influence of social media legitimacy evaluations is challenging. It requires a theoretical and empirical approach that hypothesizes and tests when (rather than whether) individuals matter. We hope our study inspires scholars to propose and test other thresholds that may explain why these evaluations are influential after a certain tipping point. Also, we hope it inspires new studies capturing the discursive legitimacy process, which we believe is crucial to understanding how citizens’ legitimacy evaluations are not formed in isolation from others.
Footnotes
Appendix
Acknowledgements
The authors are extremely grateful to the editor and reviewers of Business and Society for their suggestions and constructive critiques. In addition, they would like to thank Patrick Haack and Stelios Zyglidopolus and the participants of the Oxford Reputation Center Conference 2019 for their useful comments on a previous version of this manuscript. A final thanks goes to Anika Clausen, who supported them for the final phase of their manuscript formatting and revision.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the UniCredit & Universities Foundation Individual Grant and by the BBVA Foundation Individual Grants.
