Abstract
The Weibo platform is a social space for interaction and expression. This requires scholars to examine communication patterns and the communicated content among Weibo users simultaneously. Based on theories of ‘network and culture’ and relational sociology, this article contends that network fields and the communicated cultural meanings are mutually constituted. A latent Dirichlet allocation (LDA) topic model and social network analysis techniques were used to examine 51,288 Weibo posts published by users concerned about workers, revealing the relationship between community structures and communities’ focal topics. Specifically, the result of LDA topic modeling shows that the focal topics regarding labor issues could be categorized into four groups: workers’ culture (art and entertainment) and welfare; predicaments and problems; strikes (rights defending actions) and labor organizations; and institutions and labor rights. Analysis of interaction patterns among users resulted in the identification of five major online communities which, based on the primary communicated topics within communities, were labeled as the Labor Homeland Community; Labor Culture Community; Labor Rights Protection Community; Labor Interest Concerned Community; and Labor Institution Concerned Community. The results also showed two new trends in relation to labor issues: first, workers’ culture and their integration into urban life have garnered increasing online attention with the growth of new generation workers; and second, the Weibo platform provides an interaction channel for labor researchers and labor non-governmental organizations, and such interaction helps the latter to reflect critically on the current conditions or plights of workers from an institutional/structural perspective. This article concludes with a discussion about the significance of utilizing big data analytics to study online culture and social mentality.
Introduction and research questions
The advent of the Internet has provided an enabling space for the development of social organizations. Specifically, it empowers resource-poor organizations by providing a tool for self-representation, networking, public mobilization, and the framing of alternative discourses (Sima, 2011; Yang, 2003). Existing studies have mainly investigated the social media usage and practice of non-governmental organizations (NGOs) by means of three different theoretical approaches. The first approach was to examine the online influence of NGOs through the perspective of digital inequality. NGOs’ usage of social media is influenced by the organizations’ strategy, capability, governing structure and external pressure (Nah and Saxton, 2013). The online influences of their Weibo posts are, to a large extent, shaped by the topics and their online features, while the effects of the offline organizational traits are much smaller (Huang and Gui, 2014). Meanwhile, online networks between organizations are impacted by organizational legitimacy, offline cooperation, spatial proximity, and homophily of interest (Huang et al., 2014).
The second approach was to investigate social media’s empowering function for collective actions through the perspective of Internet politics. Existing case studies have shown that NGOs used Weibo to share information, broadcast the development of protest events, communicate with traditional media and general Internet users, frame protests, and mobilize the public, and thus they coordinated long-lasting collective actions and empowered themselves (Chen, 2014; Chen and Zhang, 2015). A case study on the NGO Love Save Pneumoconiosis (da’ai qingchen) illustrated labor organizations’ discursive politics in the space of Weibo. On the one hand, they developed an alternative discourse to interpret the situation faced by workers. On the other hand, polyphonic expressions were adopted to legitimize their works and to politicize the issue of pneumoconiosis (Gleiss, 2015).
The third approach concentrated on NGOs’ usages of social media. They use social media to disclose information, build community, and motivate the public (Lovejoy and Saxton, 2012). A study found that in the US, Twitter is used mainly for one-way communication (Lovejoy et al., 2012). In China, NGOs use Weibo in a similar manner, but informal expressions and tweets unrelated to the organizations’ missions are more common (Zhou and Pan, 2016).
Although existing literature has analyzed NGOs’ usages of social media from the above-mentioned perspectives, more studies are still needed. First, previous studies have tended to see social media as an instrumental resource (e.g., online influence and mobilization structure), which separates the form and content of online communication (e.g., they tended to focus either on network relations or discursive competition), and thus failed to present a holistic picture of inter-organizational communities. In this age of social media, Chinese Internet users are extremely diverse, and they differ in patterns of social media usage and online opinions on a range of social, political, and diplomatic issues, a fact which should not be neglected (deLisle et al., 2016). As a result, conclusions of case studies or studies on specific aspects of social media usage might be biased. Moreover, social media is not only an information-driven medium, but also a polycentric social space (Cavanagh, 2007; Fenton, 2012). The conception of social media as a social space indicates that nuanced understandings of user communities and the associated online public space cannot be reached without jointly investigating the online interactions between users and the symbols, identities, and cultures embedded in the interaction process. Furthermore, previous studies mostly used case studies and traditional content analysis methods and failed to take advantage of social media big data. This prevented them from understanding NGOs and their social media usages or online expressions in a holistic manner.
Based on the above discussion, this article aims to investigate Weibo users who are concerned about labor-related issues by analyzing Weibo textual big data through topic modeling and social network analysis methods, and thus to present a holistic big picture of the labor rights community. Specifically, this article investigates: (1) the community structure among labor rights-concerned Weibo users; (2) the primary topics expressed during their online interactions and communications; and (3) the relationship between inter-community interactions and community cultures manifested in the expressed topics.
Using big data analytics to analyze these questions is of great significance. First, as NGOs concerned with labor rights have developed and differentiated, their social media usages have become increasingly diverse. A bird’s-eye view of their social media usage helps us to evaluate, in a more accurate manner, social media’s empowerment effects for NGOs. This not only avoids over-emphasizing the significance of particular types of social media usages, but also helps us to discover new trends of social media usage. Second, this article preliminarily demonstrates the potential of topic modeling, an unsupervised text analysis technique, in Internet culture studies, and showcases the potential for sociologists to embrace big data in practice (Sun and Shi, 2016). This article suggests that, besides using word frequency extracted from existing big data corpora (Chen, 2015; Chen et al., 2015), scholars may also explore the cultural meanings expressed in the Internet space by conducting semantic analyses of raw textual data on social media platforms. Third, previous studies mainly took advantage of the chronological trends of topics and the relationship between topics and other features of the texts in order to effectively interpret and verify results of topic models (DiMaggio et al., 2013; Levy and Franklin, 2014). In addition to investigating the chronological trends of topics, this study analyzes the association between topics and networking patterns among users. This approach bridges the structural and cultural analyses of social networks and thus avoids the tendency to analyze social networks in a highly formal manner (Fuhse and Mützel, 2011; Huang et al., 2014).
Network fields and cultural identities in the Weibo space
This article regards social media as a social space and investigates user communities concerned about labor rights through the perspective of community structure and culture (Cavanagh, 2007). In the digital age, online communities manifest as patterned connections between Internet users (Cavanagh, 2007), while community identity is presented in the texts and symbols published by community members (Tamburrini et al., 2015). This conception indicates that this article can draw upon the theoretical insights of relational sociology to study online communities. Relational sociology argues that cultural forms and social networks are not two autonomous concepts, and therefore cannot be measured separately, nor can the measures of these two concepts be used to investigate the causal relations between them. On the contrary, social networks are formed in the cultural process of communication (Mische, 2011; Mützel, 2009). According to Mützel, the relations between actors and the associated discursive interactions or ‘stories’ constitute ‘netdoms’ in competition with each other, while identities arise through contingent interactions and competition between different netdoms (Mützel, 2009). In other words, the structural and cultural dimensions of relations are mutually constituted. Netdom switching generates cultural identities through the process of comparison and reflection and thus leads to the emergence of ideas and meanings. In this theory, culture is shared by communities in a fluid manner instead of existing in the abstract (DiMaggio, 2011), while the structure of social groups is the result of the formation or termination of relations between actors.
Big data analyses of social media have, to a certain degree, supported the above theories. For instance, the ideological leanings of politicians can be estimated by studying the patterns of connections in their Twitter networks (King et al., 2016). A study on Facebook also showed that the ideologies of politicians and their supporters can be estimated based on individual citizens’ online endorsements of political figures (Bond and Messing, 2015). A quantitative analysis of Twitter users also demonstrated that communication styles such as the frequency of word usage change according to the intended audience communities (Tamburrini et al., 2015). However, this study only demonstrated that different online communities share different cultures, and did not explore the meanings of the communicated texts and their associations with patterns of online networking.
With the development of computational social science, the latent Dirichlet allocation (LDA) topic model has provided scholars with a valuable tool for exploring cultures and frames (Blei, 2012; DiMaggio et al., 2013). Using topic modeling, scholars can extract meaningful discursive frames from an enormous corpus of textual data. For instance, Nowlin (2016) used topic models to analyze policy texts and extracted multi-dimensional, competitive issue frames, and further investigated the association between issue frames and changes in the policy environment. Topic models can estimate the topic distribution in every text, and accordingly, the popularity of topics at the corpus level can be aggregated. Therefore, topic modeling provides an empirical basis for investigating the link between micro-level and macro-level issue frames (Nowlin, 2016). Given that the formation of community identity and culture are the result of members’ use of specific discursive frames, and the communicative interactions between community members can be seen as a network field, joint analyses of communicative cultures using topic modeling and interaction patterns using social network analysis techniques can enhance our understanding of the cultural process undergirding online communities.
In sum, the conception of social media as a social space suggests that researchers can investigate online social networking and the communicated texts and symbols from the dual perspectives of network field and cultural identity. On the one hand, the investigation of discursive symbols communicated in social media users’ online interactions enables one to transcend the purely structural analyses of social networks, and to enrich our understanding of the cultural aspects of online communities. On the other hand, by taking into consideration the network field in which discursive texts and symbols are communicated, scholars might be able to interpret the social meaning of texts and symbols in a contextual and more accurate manner. Moreover, in the Web 2.0 era, the vast amount of data generated on social media platforms and the advancement of textual analysis techniques provide sociologists with an unprecedented opportunity to study online cultures and online social networks (Evans and Aceves, 2016). Taking Weibo users (NGOs) who are concerned with labor rights as a case, this article investigates the social networking patterns among users and the communicated discursive texts and symbols, and thus preliminarily offers a holistic picture of the social media usages and online expressions of NGOs concerned with labor rights. Based on the above discussions, this article aims to answer the following questions:
Q1: What online communities were formed during Weibo users’ online interactions?
Q2: What are the major issues (topics) communicated within each community?
Q3: What are the associations between cross-community interactions (networking) and community cultures?
Data and methods
Fourteen NGOs concerned with labor rights were selected as seed users, after which 51,288 Weibo posts containing the seed users were extracted from the Social Media Processing 2015 Weibo Data Set. Topic modeling, a machine learning technique, was used to identify topics expressed in the selected Weibo corpus (Blei, 2012; DiMaggio et al., 2013; Jacobi et al., 2016; Nowlin, 2016). In topic models, a topic is a statistical distribution of words, and words of a topic are more likely to co-occur in a document (or a Weibo post in this study). Topic modeling estimates the probabilities of words belonging to a particular topic, and researchers can interpret the meaning of a topic by inspecting its most likely words (those with high probabilities of belonging to the topic). Each document can discuss multiple topics, and the prevalence of topics is described by a probability distribution. In this sense, the LDA topic model is a mixed membership model (Grimmer and Stewart, 2013). Since it is common for a Weibo post to discuss two or more topics, using a mixed membership model means that it is not necessary to make a sometimes arbitrary binary decision as to whether a Weibo post only discusses a particular topic. Thus, the model can describe the topical characteristics of Weibo posts in a more accurate manner. A fitted topic model can also predict the topic distribution of each document; therefore, the author can select typical posts of a particular topic to evaluate the topical validity of the modeling results.
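The idea of mixed membership, and of picking typical posts for validity checks, can be illustrated with a minimal sketch. The post identifiers and probabilities below are hypothetical toy values, not results from this study, and the helper `typical_posts` is an illustrative name rather than part of any library.

```python
from typing import Dict, List, Tuple

# Hypothetical per-post topic distributions (toy values, not study results).
# Under mixed membership, each post spreads probability over all topics.
doc_topics: Dict[str, List[float]] = {
    "post_a": [0.70, 0.20, 0.10],
    "post_b": [0.05, 0.15, 0.80],
    "post_c": [0.45, 0.45, 0.10],  # a post that discusses two topics equally
    "post_d": [0.10, 0.85, 0.05],
}


def typical_posts(
    doc_topics: Dict[str, List[float]], topic: int, k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k posts with the highest probability for the given topic."""
    ranked = sorted(
        ((post, probs[topic]) for post, probs in doc_topics.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:k]


# Posts most representative of topic 0, for manual reading and validation.
top_for_topic_0 = typical_posts(doc_topics, topic=0)
```

Reading the highest-ranked posts for each topic is one simple way to judge whether the top words of a topic have been interpreted sensibly.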
Topic modeling proceeded according to the following steps. First, the textual data were preprocessed, including word tokenization and feature selection. The most common procedure of feature engineering is stop word filtering. In addition, named entity words (Burscher et al., 2016) and Weibo usernames were also filtered, because exploratory modeling showed that keeping these two types of words led to identifying controversial online events instead of issues and frames as topics. Second, the number of topics was determined and the final topic models were estimated. The methodological literature recommends that the number of topics be determined based on the statistics of perplexity and coherence, as well as the interpretability of topics. A smaller value of perplexity indicates a better model, as does a larger value of coherence. However, perplexity tends to favor unnecessarily complex models, and thus a judgment call had to be made based on both perplexity and coherence. The ‘c_v coherence’ was used in this study because it correlates highly with human evaluation (Röder et al., 2015). Third, the validity of the resultant topics was manually evaluated and further analyses were conducted on topics with high levels of validity. The research goal was also taken into consideration in the process of determining the number of topics. As Nowlin (2016) argued, specific topics are preferred when studying the framing strategies of particular actors, while general topics are preferred when studying issue definition at the collective level.
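The filtering part of the preprocessing step might look roughly like the sketch below. It is only an illustration: whitespace splitting stands in for a real Chinese word segmenter, the stop word and named entity lists are hypothetical, and the source does not describe the actual tools or lists used.

```python
import re
from typing import List, Set

# Matches '@username' mentions; the username pattern is an assumption.
MENTION = re.compile(r"@[\w\-]+")


def preprocess(text: str, stop_words: Set[str], named_entities: Set[str]) -> List[str]:
    """Drop usernames, stop words and named entities from one post."""
    text = MENTION.sub(" ", text)   # remove Weibo usernames
    tokens = text.lower().split()   # placeholder for a proper word segmenter
    return [t for t in tokens if t not in stop_words and t not in named_entities]


# Hypothetical example post and word lists.
tokens = preprocess(
    "@UserA the workers in Shenzhen demand negotiation",
    stop_words={"the", "in"},
    named_entities={"shenzhen"},
)
```

Removing usernames and named entities in this way keeps event-specific names from dominating the vocabulary, which is the motivation given in the text for filtering them.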
Along with topic modeling, an interaction network was constructed based on the mention relations (‘@username’) embedded in Weibo posts. If user A directly mentions (‘@’) user B when posting or reposting a Weibo tweet, then an edge from A to B exists in the interaction network. It is worth noting that only direct mentions were counted in the edge weights, while cases of A reposting B’s Weibo tweets were excluded. Since each user may mention other users multiple times across a range of posts, the frequency of direct mentions was defined as the edge weight in the interaction network. Following the above procedure, a directed weighted network was constructed consisting of 14,730 nodes (users) and 41,202 edges (mention relations).
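A minimal sketch of this edge-counting procedure follows. It assumes that reposted content is introduced by the common Weibo marker ‘//@’ and simply discards everything after the first such marker; the source does not say how direct mentions were separated from repost chains, so this rule, the username pattern, and the example posts are all assumptions.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Username pattern is an assumption; real Weibo names may need a broader rule.
MENTION = re.compile(r"@([\w\-]+)")


def build_mention_edges(posts: Iterable[Tuple[str, str]]) -> Counter:
    """Count direct mentions as weighted directed edges (author, mentioned)."""
    edges: Counter = Counter()
    for author, text in posts:
        own_text = text.split("//@")[0]  # assumed repost marker: keep the author's own words
        for target in MENTION.findall(own_text):
            edges[(author, target)] += 1
    return edges


# Hypothetical posts as (author, text) pairs.
posts = [
    ("A", "@B look at this @C"),
    ("A", "thanks @B"),
    ("B", "agreed //@A: original post @C"),  # repost: no direct mention by B
]
edges = build_mention_edges(posts)
```

The resulting counter maps each ordered pair of users to a mention frequency, which corresponds to the edge weight described above.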
Because infrequent mentions imply random or unstable community identities, infrequent mention relations were filtered out. To determine a reasonable cut-off point, the chronological distribution of Weibo posts was analyzed. Results showed that the monthly number of posts exceeded 1000 in 21 months. Assuming that a mention frequency of at least one per month across the 21 active months indicates a long-lasting and stable online relation, 21 was used as the cut-off point for frequent interactions. After mention relations with edge weights below 21 were eliminated, most users (14,085) became isolated nodes, a few users (23) formed small network components of two to five nodes, and the remaining 622 users constituted a giant network component. Community detection on the giant network component resulted in five communities with more than 50 members.
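The thresholding and component steps can be sketched with the standard library alone. The edge list below is a hypothetical toy example, and components are computed while ignoring edge direction, which is an assumption; community detection itself is not shown because the source does not state which algorithm was applied.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def filter_edges(edges: Dict[Edge, int], min_weight: int) -> Dict[Edge, int]:
    """Keep only mention relations whose weight reaches the cut-off."""
    return {edge: w for edge, w in edges.items() if w >= min_weight}


def weak_components(nodes: List[str], edges: Iterable[Edge]) -> List[Set[str]]:
    """Connected components ignoring direction (union-find), largest first."""
    parent = {n: n for n in nodes}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for source, target in edges:
        parent[find(source)] = find(target)

    groups: Dict[str, Set[str]] = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical weighted mention edges (toy values).
edges = {("a", "b"): 30, ("b", "c"): 25, ("c", "d"): 3, ("e", "f"): 21, ("g", "a"): 1}
nodes = sorted({n for edge in edges for n in edge})

kept = filter_edges(edges, min_weight=21)      # same cut-off as in the text
components = weak_components(nodes, kept)
giant = components[0]                          # largest component
isolates = [c for c in components if len(c) == 1]
```

In this toy network, users whose only relations fall below the cut-off end up as isolated nodes, mirroring what happened to most users in the study.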
Finally, to explore the topic prevalence in the five communities, Weibo posts were grouped into five categories according to whether members of each community participated in them, the topic distribution of each Weibo post was calculated, and the average topic probabilities for each community were then summarized. The group-wise average topic probabilities also provided a solid empirical basis for interpreting the cultural identity of each community.
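The group-wise averaging can be sketched as follows. The inputs are hypothetical toy values, and the sketch assumes that a post counts toward every community that has at least one member among the users involved in it; the source does not define ‘participated’ more precisely.

```python
from collections import defaultdict
from typing import Dict, List


def community_topic_means(
    post_topics: Dict[str, List[float]],
    post_users: Dict[str, List[str]],
    membership: Dict[str, str],
) -> Dict[str, List[float]]:
    """Average the topic distributions of the posts linked to each community."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for post, probs in post_topics.items():
        # Communities with at least one member involved in this post.
        communities = {membership[u] for u in post_users[post] if u in membership}
        for c in communities:
            previous = sums.get(c, [0.0] * len(probs))
            sums[c] = [s + p for s, p in zip(previous, probs)]
            counts[c] += 1
    return {c: [s / counts[c] for s in total] for c, total in sums.items()}


# Hypothetical toy data: two topics, three posts, two communities.
post_topics = {"p1": [0.8, 0.2], "p2": [0.4, 0.6], "p3": [0.1, 0.9]}
post_users = {"p1": ["u1", "u2"], "p2": ["u1", "u3"], "p3": ["u3"]}
membership = {"u1": "X", "u2": "X", "u3": "Y"}

means = community_topic_means(post_topics, post_users, membership)
```

Comparing the resulting averages across communities shows which topics each community attends to most.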
This study used Python for data processing and analyses. Topic models were estimated by the gensim package, and social network analyses were conducted using the python-igraph package.
Findings
The Weibo corpus analyzed in this study was posted between October 2010 and March 2014. The chronological distribution (Figure 1) shows that few Weibo posts were posted before 2012, when the monthly number was less than 400. From 2012, the number of posts increased gradually, although fluctuation can be observed. Specifically, the number of posts in March 2012 exceeded 500, and the upward trend continued until September 2013 (4662 posts), after which the number began to decrease rapidly. This downward trend might correlate with the decline of Weibo’s popularity. Meanwhile, the decrease in Weibo posts in February and March 2014 may also have resulted from the fact that data collection terminated at about the same time. Therefore, data from these two months were excluded from further analyses.
Figure 1. Monthly distribution of Weibo posts.
Selection of topic models
To determine the number of topics, a series of topic models was estimated with the number of topics ranging from 2 to 40, and candidate models were then preliminarily selected by calculating and comparing the values of perplexity. Results showed (Figure 2) that the perplexity values decreased as the number of topics increased, which was consistent with previous studies (e.g., Jacobi et al., 2016). When the number of topics exceeded 12, the decrease in perplexity values became relatively small. The coherence values of models with the number of topics ranging from 4 to 30 were then calculated, and the results suggested that when the number of topics was 7, 9 or 13, coherence values were relatively large. Based on perplexity, coherence and the interpretability of the resultant topics, the 13-topic model was chosen as the final model.
Figure 2. Model comparison and selection.
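For reference, perplexity on a held-out set of documents is conventionally defined as below. The notation is introduced here for illustration only and does not appear in the source: $M$ is the number of held-out documents, $\mathbf{w}_d$ the words of document $d$, and $N_d$ its length. Whether the study computed perplexity on held-out or training data is not stated.

```latex
\[
\mathrm{perplexity}(D_{\text{test}})
  = \exp\left\{ -\,\frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right\}
\]
```

Because the exponent is the negative average log-likelihood per word, a lower value means the model assigns higher probability to the documents, which is why smaller perplexity indicates a better fit.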
What issues (topics) were the labor communities concerned about?
Table 1. Results of topic modeling (10 most important words).
Notes: only topics with identifiable meaning and clearly related to labor (organizations) were reported. Less relevant topics include: (a) Chengguan, Little, This, No, Court, Innocent, Pioneer, Violence; and (b) Just, Unable, Lack, Thinner, Always, Nevertheless, Power, Parade, Hour, Freedom. Topics without clear meaning include: Dispatch, Labor, Unwilling, Dream, Not, Suggestions, See, Government, Forever, Vote.
Labor culture (art and entertainment) and public welfare
Topic 1 included keywords such as performance, labor, art troupe, youth, music, ballad, songs, share, and click (Table 1). Keywords such as performance, art troupe, music, ballad, and songs demonstrated that topic 1 mainly concerned songs or performances that reflect the lives of migrant workers or consider them as audiences. One Weibo post read: ‘To our painful, stubborn and unconstrained youth, these songs are for the sheltered youth, the despaired youth, the youth that grew under restriction’. It showed that ‘youth’ is an important theme for workers’ art. Accordingly, youth was an important keyword in topic 1. Some posts further reflected upon the youth of migrant workers and criticized the reality. One post said: ‘The rights protection protests among southern China workers are similar to what happened during the May Fourth Movement. Here is my newest song called ‘Labor & Youth’, which was sung for our sheltered youth’. Some singers also emphasized that what they sang was about ‘our reality with a tactile sensation’. Keywords such as share and click demonstrated that the new generation of migrant workers were internet-savvy in sharing and promoting songs and performances of the worker communities. Expressions such as ‘Listen to it, it is still fresh’, ‘Click to play’, and ‘The video of performance will be uploaded later’ were common. The distribution and trend of topics (Figure 3) shows that the topic of workers’ arts was slowly gaining attention, but that the overall prevalence was not high.
Figure 3. Temporal trends of topics.
Primary words of topic 2 included Spring Festival Gala, public welfare, everyone, support, kids, friends, one, community, children, and activities. The phrase Spring Festival Gala implied that an important aspect of topic 2 was the discussion of the workers’ Spring Festival Gala. One post read: ‘This is the full-length version of the 2013 workers’ spring festival gala, please view and share it. The workers’ spring festival gala: a gala for 300 million workers!’ Another post described the orientation of the workers’ spring festival this way: ‘The spring festival gala launched for 300 million Chinese workers is now launched. With new workers as the subject, this gala promotes the values of labor and strives for justice and equality of society. Workers’ spring festival gala: a gala for workers!’ While sharing comments about this gala, Weibo users also sent New Year greetings to ‘everyone’, and called upon ‘friends’ to widely repost the video, which made everyone and friend two primary keywords of topic 2. Topical prevalence showed that posts about the workers’ Spring Festival Gala reached their peak in January 2013 and 2014, which was consistent with the timeline of the activity. The words public welfare probably appeared in this topic both because organizers of the workers’ Spring Festival Gala identified themselves as a public welfare organization and because participants of the gala were active in public welfare activities in their daily lives. Moreover, the gala itself was once reported by the Chinese Public Welfare (zhongguo gongyi) television program. Children and community were common topics of labor welfare, which explains why they appeared in this topic.
Predicaments and problems of workers
Keywords of topic 3 such as leukemia, pneumoconiosis, life, and treatment implied that the topic dealt mainly with occupational diseases of workers, while keywords such as help, assistance, and good-will people suggest that Weibo users tried to call for social support on the Weibo platform. One Weibo user wrote: ‘Repost again. Hope more people know about the miserable lives of pneumoconiosis migrant workers!’ The Weibo account of Sohu (a well-known news portal in China) also wrote: ‘Accompanied by reporters from various news agencies, we went to a suburban village in Shunyi to investigate the living situations of pneumoconiosis migrant workers … Interviews and investigation took part in open space and on road sides. Such an interview was, to some extent, special.’ These representative Weibo posts were consistent with Gleiss’s (2015) study. However, analysis of topical trends showed that the topic of occupational diseases accounted for a low proportion and did not become a labor issue that widely attracted public attention.
The most typical words of topic 4 included society, problem, city, migrant workers, and life. These words discussed the social problems caused by the institutional separation between urban and rural. One Weibo user wrote: ‘I reside in the city, but I am looking forward to the village, because my kid and parents live there’. In addition, some Weibo posts directly pointed out multiple sources of stress faced by left-behind mothers, or a dilemma facing new migrant workers when they returned to their hometown: ‘When new migrant workers return to their hometown to make a living, they have two choices, to live by farming or continue to live by physical labor. If they choose to live by farming, this new generation of migrant workers face the problem of lacking farmland and farming skill’. Other posts described new generation workers’ desire to blend in with metropolitan life: ‘Our generation had no substantive connection with rural areas despite having a rural hukou … We developed a sense of attachment to the city, but without a sense of belonging’. Meanwhile, a small proportion of posts analyzed the institutional barrier that hindered migrant workers from blending into the city: ‘China’s modernization process discriminated against farmers unconsciously. Urbanization accepts farmland, but not farmers without their land. Industrialization needs the labor of migrant workers, but not migrant workers themselves as citizens’. Those posts analyzed the complex relations between migrant workers, rural and urban, and therefore this study labels topic 4 ‘urban inclusion’. The topical trend suggested that the topic of ‘urban inclusion’ was prevalent at an average level, but showed an apparent increasing trend before September 2013.
Topic 5 is made up of high probability words such as society, migrant workers, issue, institution, school, student, and occupational injury. In contrast with topic 4, although topic 5 also mentioned problems faced by migrant workers, it discussed them through an institutional perspective rather than focusing on urban–rural relations. Of these posts, education of migrant workers’ offspring was the center of the discussion, as demonstrated by the topical prevalence during the period from June 2012 to July 2012. Analysis of posts showed that the Beijing Tongxin Elementary School for Migrant Workers’ Children (Tongxin School) was shut down by the local government in June 2012, which ignited netizen discussions calling for the protection of Tongxin School and raised concerns about subsequent arrangements for the migrant workers’ children. One Weibo user posted this statement: ‘I hope that more people pay attention to the rights protection of migrant workers and the issue of their children’s education!’
Labor organization and rights protection
Topic 6 mainly discussed workers’ rights protection actions. The primary words such as employee, factory, representative, company, employer, and police indicated three types of actors in rights protection actions, namely labor, capital and police, while words such as negotiation and demand(s) described the negotiations between the parties of labor and capital. The topic trend showed a significant rise in the second half of 2013 and a decline in early 2014. Analysis of Weibo posts associated with topic 8 showed that support of workers’ representatives was a salient theme. In the subsequent development of those protest actions, worker representatives were prosecuted, and netizens and ‘defense lawyers’ of the prosecuted actively expressed themselves on the Weibo platform.
Words such as labor, labor union, organization, and representatives suggested that topic 7 focused on labor organizations and representatives. Analysis of Weibo posts showed that the words labor union reflected the fact that workers hoped labor unions could fight for their rights during rights protection actions. If the existing labor unions failed to do so, it was hoped that a new labor union could be formed by direct election, or worker representatives could be elected to protect workers’ rights. Some Weibo users wrote about several cases of directly elected labor unions in Shenzhen’s enterprises: ‘Hereby I would like to share my thoughts and suggestions about the practice of such a direct election, through which I hope to build a solid foundation for the workers’ freedom to build associations and to prevent the direct election of labor unions from becoming a fraud’. The fact that workers, enterprises and labor unions were the primary participants of collective negotiation also explained why enterprise and labor union were high probability words for this topic. Words such as demand, salary, negotiation, and organize showed that topic 7 discussed how workers’ representatives organized workers to protect their own rights, or to negotiate with their employers on issues such as salary, overtime work, and social insurance. Analysis of the word negotiation conveyed the following relations: on the one hand, workers’ inadequate negotiating capability drove them to protect their own rights by collective actions; and on the other hand, collective actions strengthened workers’ negotiating capability. For example, one post mentioned, ‘Workers may consider ending their strike, given that the employer makes concession to other demands’. 
Comparing topics 6 and 7 shows that although both topics concerned rights protection, the former focused on rights protection actions and events, while the latter emphasized the importance of workers’ organizations and representatives in protecting workers’ rights. One post suggested that workers ‘band together and negotiate collectively to protect their rights’, and at the same time also ‘strategically build a solid foundation to strive for long-lasting and effective communication between labor and capital’. Figure 3 shows that the topic ‘labor organizations’ steadily gained popularity between 2012 and 2013, but gradually declined from late 2013.
Topic 9 discussed a particular case of labor dispute. Specifically, a workers’ representative in a Shenzhen company was detained by the police for more than a year, and his family and coworkers called for public attention on the Weibo platform. Words such as liberty, law, and human rights clearly revealed the perspective of Weibo users on this dispute.
Institution and labor rights
Top words of topic 10 included nation, democracy, society, people, and politics, which indicated that this topic tended to discuss political issues such as socio-political institutions and political systems. Some Weibo posts directly discussed political issues but did not connect them with labor issues, while others talked about labor issues from a political and institutional perspective. Analysis of topic distribution showed that topic 10 accounted for only a small proportion.
Community structure and cultural identity
As mentioned previously, using the community detecting technique to analyze the interaction network between users resulted in the identification of five communities with more than 50 members. Visualization of the social network (Figure 4) showed that frequent interactions could be observed within each community, while interactions across different communities were much less common. This section will integrate the results of community detection and topic modeling so as to explore the levels of attention that each community paid to different topics, and thus to understand their latent cultural identities manifested in their online expressions.
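The integration step described above can be illustrated with a minimal sketch. It assumes that each post carries a vector of topic proportions (for example, the output of an LDA model), that each post has a known author, and that each author has been assigned to a detected community. The function name, the input dictionaries, and the choice to average over posts rather than over users are illustrative assumptions, not the procedure reported in this study.

```python
from collections import defaultdict


def average_topic_prevalence(post_topics, post_author, user_community):
    """Average per-post topic proportions within each community.

    post_topics:    dict post_id -> list of topic proportions (hypothetical input)
    post_author:    dict post_id -> user handle
    user_community: dict user handle -> community label
    """
    sums = {}
    counts = defaultdict(int)
    for post_id, proportions in post_topics.items():
        community = user_community.get(post_author[post_id])
        if community is None:
            continue  # author does not belong to any detected community
        if community not in sums:
            sums[community] = [0.0] * len(proportions)
        for k, p in enumerate(proportions):
            sums[community][k] += p
        counts[community] += 1
    # Divide each topic's summed proportion by the community's post count.
    return {c: [s / counts[c] for s in totals] for c, totals in sums.items()}


# Toy data, purely illustrative (two topics, two communities).
post_topics = {"p1": [0.8, 0.2], "p2": [0.6, 0.4], "p3": [0.1, 0.9]}
post_author = {"p1": "u1", "p2": "u2", "p3": "u3"}
user_community = {"u1": "A", "u2": "A", "u3": "B"}
prevalence = average_topic_prevalence(post_topics, post_author, user_community)
```

In this toy example, community A leans towards the first topic and community B towards the second, which is the kind of contrast used to label the communities.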
Visualization of the social network.
Analyses of cross-community interactions.
Note: the numbers without parentheses are the numbers of users who participated in cross-community interaction (from the community in the row to the community in the column); the numbers in parentheses indicate the frequency of interactions.
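The two quantities described in the note can be tabulated from a list of directed mentions, as in the following minimal sketch. It assumes that an interaction is a (source user, target user) pair and that "users who participated" refers to the distinct source users in the row community; both are assumptions made for illustration, and the function and variable names are hypothetical.

```python
from collections import defaultdict


def cross_community_table(mentions, user_community):
    """Count cross-community interactions for each (row, column) community pair.

    mentions:       iterable of (source_user, target_user) pairs
    user_community: dict user handle -> community label
    Returns a dict (row_community, column_community) -> (n_users, n_interactions).
    """
    users = defaultdict(set)
    freq = defaultdict(int)
    for source, target in mentions:
        row = user_community.get(source)
        col = user_community.get(target)
        if row is None or col is None or row == col:
            continue  # skip unassigned users and within-community mentions
        users[(row, col)].add(source)  # distinct users initiating the interaction
        freq[(row, col)] += 1          # total number of interactions
    return {pair: (len(users[pair]), freq[pair]) for pair in freq}


# Toy data, purely illustrative.
mentions = [("u1", "u3"), ("u1", "u3"), ("u2", "u3"), ("u3", "u1"), ("u1", "u2")]
user_community = {"u1": "A", "u2": "A", "u3": "B"}
table = cross_community_table(mentions, user_community)
```

Because the pairs are ordered, the cell for A to B is kept separate from the cell for B to A, which allows asymmetries in cross-community interaction to be read directly from the table.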
Analysis of the structure of the Labor Homeland Community found that the frequently mentioned users included @SHWLDZGC, @New Labor Net (xin gongren wang), @WDZKH, @Tongxin Experimental School (tongxin shiyan xuexiao), and @Labor Union Home of Workers (gongyouzhijia gonghui), while users with the highest degrees included @SHWLDZGC, @WDZKH, @New Labor Net, and @Tongxin Experimental School. User profiles showed that: @SHWLDZGC was a worker and singer who created the Home of Workers’ New Worker Art Troupe (gongyouzhijia xingongren yishutuan); @Tongxin Experimental School was a philanthropic school for children of migrant workers, which was established with the assistance of many types of social forces; @New Labor Net was a website founded by a Beijing company called Tongxinhuhui Ltd. (tongxinhuhui kemao gongsi); @WDZKH was a staff member of the Home of Workers and the head of the Tongxinhuhui Public Welfare Store; and @Labor Union Home of Workers identified itself as the common homeland for all migrant workers. Through analysis of these user profiles, two preliminary conclusions were reached in this study: first, the formation of online interaction communities is profoundly influenced by offline relations between community members, a conclusion consistent with a previous study (Huang and Gui, 2014); and second, the traits of core community members to some extent influenced a community’s cultural identity and its interaction patterns with other communities.
Community 2 (diamond-shaped nodes on Figure 4) consisted of 96 users, with its members particularly interested in topics of rights protection actions, rights protection representatives, and labor organizations. Thus, this study calls it the Labor Rights Protection Community. Analysis of its community structure showed that users @JJDCS, @ZFYWB, @Editor of the Collective Negotiation Forum (jiti tanpan luntan xiaobian), @HHG, and @PPS were the most frequently mentioned, while core users with the highest degree included @ZFYWB, @Service Department for Zhongshan Migrant Workers, @GJZHWQC, @CDGM, and @JJDCS. @JJDCS mentioned in his (her) personal profile that he (she) was willing to fight for labor rights. @ZFYWB was a staff member at the service department for migrant workers at Guangzhou’s Panyu District and identified himself (herself) as a rights protection citizen. According to Caixin Net’s reports, @ZFYWB took an active role in the organization of several labor protests, and @HHG was another user who cared about labor rights and called for the building of multilateral relations between labor, capital, and government through collective negotiations. As the user nickname revealed, @Editor of the Collective Negotiation Forum was committed to the promotion of collective negotiations. As labor rights protection and collective negotiation often involved a labor union, @Guangdong Labor Union (Guangdong gonghui) was also frequently mentioned by community members. This finding showed that it was reasonable to identify the term labor union as a keyword in the labor organization topic.
Analysis of community 2 as a single social network showed that its network transitivity was 0.154 (smaller than that of the Labor Interest Concerned Community but larger than that of the others), which suggested that this community had a high level of within-community interaction. Further analysis also confirmed (Table 2) that members of community 2 primarily interacted with members of their own community, while the total number of interactions with other communities was less than 1,000. Relatively speaking, the Labor Rights Protection Community (community 2) interacted most frequently with the Labor Interest Concerned Community, while the Labor Institution Concerned Community ranked second, and the Labor Homeland Community and Labor Culture Community ranked last. This pattern of cross-community interaction appeared to be strongly influenced by the compatibility of cultural identities. To sum up: (1) personal profiles of community core members were consistent with the topics of greatest concern to a community, and this to some extent supported the validity of the topic modeling result; (2) similar to previous studies’ findings (e.g., Chen, 2014), labor organizations had tried to protect their rights through the Weibo platform, and protest participants formed close-knit communities during the actions; and (3) the boundary of online communities was both fluid and stable in the sense that cross-community interactions existed, but this type of interaction was more likely to occur between communities with compatible cultural identities.
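For readers unfamiliar with the measure, network transitivity is commonly defined as the share of connected triples (two ties sharing a node) that are closed into triangles. The following minimal sketch computes it for one community treated as an undirected network. Ignoring the direction of mentions and the function name itself are assumptions made for illustration; this study does not specify the exact variant used.

```python
from itertools import combinations


def transitivity(edges):
    """Global transitivity of an undirected graph given as (u, v) edge pairs.

    Computed as closed triples / all connected triples, which equals
    3 * (number of triangles) / (number of connected triples).
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops (e.g., self-mentions)
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    closed = 0
    triples = 0
    for neighbors in adjacency.values():
        # Every pair of neighbors forms a connected triple centered on this node.
        for a, b in combinations(neighbors, 2):
            triples += 1
            if b in adjacency[a]:
                closed += 1  # the triple is closed by a tie between a and b
    return closed / triples if triples else 0.0


# Toy network: one triangle (a, b, c) plus a pendant node d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
score = transitivity(toy_edges)
```

A score close to 1 indicates that users mentioned by the same person tend to mention each other as well, whereas a score close to 0 indicates a loose structure with few closed triads.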
Community 3 (square nodes on Figure 4) had 91 members, and they were most interested in topics of labor arts, the Spring Festival Gala, and public welfare. Accordingly, this study named it the Labor Culture Community. Frequently mentioned users were @XGRYST-XD, @OCSSSDNJ, @YJL, @YZWQ, and @MAKO (maque washe), while users with the highest degree were @XGRYST-XD, @Heartbeat on the Left-Ballad Community (xintiao zai zuobian – minyaohui), @HQ Studio (HQ gongzuoshi), @0CSSSDNJ, and @XJLANG7. Among them, @XGRYST-XD was the creator of the New Worker’s Art Troupe, and @Heartbeat on the Left-Ballad Community, affiliated with the Beijing Workers’ Home, was a typical user who was committed to spreading labor culture. @YJL was a reporter who once hosted the workers’ Spring Festival Gala, and his Weibo posts sometimes mentioned topics of labor rights protection and the workers’ Spring Festival Gala. @MAKO was the official account of a live house, and @HQ Studio was a music/art studio. Although the primary concerns of these two users were not directly orientated towards labor issues, their artwork and performances were inextricably linked with labor arts and performances. Further analysis showed that the Labor Culture Community’s transitivity score was 0.010, considerably smaller than that of the other communities, which suggested that this community had a loose structure. In other words, although community members paid attention to similar topics, they did not frequently interact with each other, perhaps for two reasons: (1) popularization of labor culture inevitably required a large audience, and a close-knit online community might be disadvantageous; and (2) cultural communication relied on diverse channels and segmented market niches, which might hinder online communication among the key community members. Of course, these two explanations require further empirical examination.
Community 4 (triangular nodes on Figure 4) had 88 members, who were inclined to discuss labor issues from an institutional perspective. It seems that some community members who were labor rights activists attempted to develop alternative discourses, different from the mainstream discourse, to interpret workers’ situations, and therefore this study called it the Labor Institution Concerned Community. Examining the individual community members revealed that many core members were scholars or lawyers. Specifically, the most frequently mentioned users included @WJS, @Concerned for New Generation Migrant Workers (guanzhu xinshengdai nongmingong), @WKQ, @JXG, and @XT, while users with the highest degree included @WJS, @SYLJW, @Concerned for New Generation Migrant Workers, @LJJ2012, and @CNHKDC. It is worth mentioning that @Concerned for New Generation Migrant Workers was a joint account of nine scholars at several Beijing-based universities, who set up this Weibo account to discuss labor issues after nine consecutive incidents in which Foxconn workers committed suicide by jumping off buildings. @XT was a lawyer, @CNHKDC was a scholar concerned with labor rights issues, and @JXG was a reporter. Overall, the involvement of lawyers, scholars, and reporters might help Weibo users concerned about labor rights to attribute and diagnose workers’ miserable situations from an institutional perspective. Nevertheless, the topic of institutional analysis only accounted for a small proportion, and the Labor Institution Concerned Community lacked a close-knit network structure (the network transitivity score was only 0.065). Analysis of inter-community interactions showed that the Labor Institution Concerned Community mainly communicated with the Labor Rights Protection Community. However, interactions from the Labor Institution Concerned Community to the Labor Rights Protection Community were much more common than vice versa.
It seems that the institutional perspective had not yet resonated with or attracted much attention from other labor communities. To sum up, although online institutional discussions about labor issues were of great significance, the Labor Institution Concerned Community possessed a low level of online influence.
Community 5 (inverted triangle nodes on Figure 4) had 75 members, who discussed diverse topics. Relatively speaking, community members were interested in the topics of strikes and rights protection actions, labor organization, and urban inclusion. Although this community also focused on topics such as labor negotiation and rights protection, its members were mostly interested in the general rights of workers, and therefore this community was labeled the Labor Interest Concerned Community. The most commonly mentioned members were @Village Near City (chengbiancun), @Center of Migrant Workers (dagongzhe zhongxin), @QH17, @Shenzhen Xiaoxiaocao Labor’s Home (Shenzhen xiaoxiaocao gongyou jiayuan), and @HBMG. @Center of Migrant Workers was an NGO set up by injured workers aiming to protect workers’ rights and interests, and alleviate labor–capital conflicts. @Shenzhen Xiaoxiaocao Labor’s Home was a public welfare institution serving industrial workers, which provided workers with cultural and legal aid. @QH17 was a volunteer editor for the Worker Pioneer Net. @HMBG was a migrant worker in Dongguan. These profiles showed that this community possessed an obvious regional trait: most of the key users came from Guangdong Province, a breeding ground for labor organizations in China. In addition, this community had the following two features. First, although community members were concerned with workers’ rights and interests, they only interacted with the Labor Rights Protection Community at a moderate level (600 cross-community interactions). Perhaps this was because the Labor Interest Concerned Community mainly focused on the survival and development of labor organizations, and its members opted to protect workers’ rights and interests through organizational channels instead of direct actions. Second, its network transitivity was 0.199, higher than that of the other four communities, which indicated that within-community interactions were frequent.
A possible explanation is that limited political space pushed labor organizations to support each other.
Ordinary least squares regression analysis of the interaction patterns' effects on communicated topics.
Note: ** p < 0.05.
Conclusions and discussion
This study used big data analytics to explore, albeit in a preliminary manner, the interaction structure and expressed topics among users who were concerned with labor issues in the Weibo space. LDA topic modeling was used to analyze 51,288 Weibo posts and it was found that the labor concerned users discussed topics of labor arts, the labor Spring Festival Gala, public welfare, occupational diseases, urban inclusion, migrant worker problems, labor rights protection actions, labor organizations, and rights protection representatives. It seems that some topics reflected new trends among labor organizations and/or activists’ online activities. First, with the growth of a new generation of migrant workers, labor communities have become more and more concerned with labor culture and urban inclusion. Grassroots labor organizations and activists have attempted to showcase the living conditions and social situation of migrant workers and have indirectly shaped workers’ class consciousness. This implies that the new generation of migrant workers is more likely to be aware of their own situations. To a large degree, labor organizations and individual activists have deliberately chosen advocacy strategies that align with the general conditions of the workers they represent. Therefore, drawing insights from studies on the new generation of migrant workers can advance our understanding of the online activities of labor organizations and activists. Second, Weibo provides a platform for online networking among labor organizations, individual activists, and labor researchers. This type of online interaction not only provides researchers with a window for learning about current labor issues, but also enables labor activists and grassroots organizations to learn about academic theories and perspectives, which might become the theoretical tools for consciously reflecting on their own situations from an institutional and structural perspective.
This identifiable topic of institution and labor rights is a case in point. Nevertheless, the Labor Institution Concerned Community only possesses a moderate level of online influence, and thus their impact on China’s labor movements needs to be scrutinized in the future. These findings also show that exploratory analyses of social media texts using big data analytics are helpful for identifying new topics and trends in an inductive manner (DiMaggio et al., 2013). This is of especial importance in a rapidly changing new media ecology.
Community detection of the online interaction network, in terms of direct mention, between Weibo users identifies five major labor concerned communities. Analyses of topic prevalence at the community level shed light on the cultural identity of each community. Based on the resultant topic prevalence, this study labeled the five communities as the Labor Homeland Community, Labor Rights Protection Community, Labor Culture Community, Labor Institution Concerned Community, and Labor Interest Concerned Community. Analysis of the interaction patterns between communities finds that frequent interactions exist between the Labor Homeland Community and Labor Culture Community because of shared offline organizational affiliations of key members of the two communities, and the cultural compatibility of the two communities’ expressed topics. Specifically, the Labor Homeland Community pays attention to the relations between the new generation of workers and their urban circumstances, while the Labor Culture Community aims to present the daily lives of workers through songs and performances. Both communities care deeply about the realities of migrant workers’ urban lives. Moreover, frequent cross-community interactions exist between the Labor Institution Concerned Community and the Labor Rights Protection Community, which probably results from the fact that a conception of rights protection has gradually become the foundation of the labor movement. It is also worth mentioning that interactions between the Labor Rights Protection Community and the Labor Interest Concerned Community are not very common, which implies that labor organizations and labor rights protection actions are probably not closely related. There are two possible explanations.
First, in addition to rights protection actions, the Labor Interest Concerned Community also cares about topics such as urban inclusion and public welfare, and their inclusive understanding of workers’ rights and interests might influence their online action strategies. Second, the Labor Interest Concerned Community is also concerned with the survival and development of labor organizations, and the limited political opportunities for grassroots NGOs may lead to the avoidance of organizing or participating in labor movements. These findings once again show that the compatibility of communities’ cultural identities strongly influences their online interactions. Nevertheless, as this study has not quantified the statistical relations between similarity of cultural identities and online interactions, this argument invites further empirical investigation.
Average topic prevalence for the five major labor concerned communities.
The conception of the Weibo platform as a social space for interactions and expressions implies that scholars need to simultaneously investigate social networks formed in online interactions and the communicated content or meanings embedded in those interactions. This approach is valuable in studying online networking in a particular domain (the labor issues in this study) in a holistic manner. As the Chinese social media landscape has become increasingly diversified and polarized (deLisle et al., 2016), exploration of the holistic picture is also becoming increasingly important.
This study analyzed the mutual constitution of online interaction communities and their identification with salient topics from the perspectives of network field and cultural identity. The findings also shed light on the evaluation of the topical validity of the topic modeling results and the interpretability of the detected communities. When the community structure proves helpful for interpreting topics, and the topics from topic modeling can assist in making sense of the communities as meaningful network fields, the research findings should have a high level of validity, and vice versa. Moreover, the conception of social media as a social space provides a theoretical perspective for the empirical investigation into social media texts. Investigation into online community structure, cultural identity, and their mutually constitutive relations is a potential avenue for big data analytics of social media texts to advance sociological insights (see Evans and Aceves, 2016 for a similar argument). With the accumulation of an enormous amount of textual data and the development of natural language processing techniques, sociologists will have new tools for measuring cultural environments at both the meso and the macro levels, as well as different types of cultures (Bail, 2014), and therefore will be able to effectively investigate the dynamic relations between different cultures and the macro environments in which they are embedded. Taking this study as an example, it is reasonable to regard online communities as meso-level cultural contexts and to find that actors in different contexts have distinctive views of the same issue domain such as labor rights, while the same actors discuss different topics or use different frames in different contexts.
The theoretical insight regarding the mutual constitution of network fields and cultural identities also sheds new light on the relations between online and offline space. Traditional perspectives tend to regard online space as an extension of offline space and are skeptical about the validity of online data and the attendant distortion effect on causal inference. However, when repositioning this debate through the theoretical perspective of the mutual constitution of network fields and cultural identities, it can be shown that the content presented by Weibo users is contingent upon the network fields in which online communication processes are embedded, and the communicated contents of the same users also change across different network fields. Therefore, there might exist no such thing as the only ‘true’ or ‘valid’ social fact. Once Weibo users’ self-presented information spreads through online interactions, such information consequently impacts online networking patterns among users and constitutes a new and relatively independent network field. The cumulative changes of network fields might even have the potential to bring changes to the offline society. In this regard, how the communicated texts and symbols result in the emergence, persistence and decline of network fields, and how different network fields in turn impact the communication process between actors should be seen as a central theme of social media studies. This theoretical perspective is especially suitable for studies of Internet culture and online social mentality, among others.
This study preliminarily demonstrates the value of applying topic modeling of textual big data as a new research tool. It is worth mentioning that the integration of topic modeling and social network analysis techniques (e.g., community detection) enables sociologists to empirically analyze the link between macro social structures and micro processes. For instance, at the micro level, actors express their diverse concerns about labor issues by publishing Weibo posts, and the competing individual frames can be aggregated to reveal the dominant frames at a group level, which can be interpreted as a manifestation of a macro structure. Here, topic modeling provides a powerful tool for investigating the linkage between individual (micro) level expression and community (macro) level structure of topics or frames. Furthermore, users preferring differentiated labor-related frames are in competition or alliance with each other, which in turn facilitates or hinders their online networking, and the resultant networks provide network fields for further framing the labor issue. In sum, integration of network analysis techniques and topic modeling enables scholars to analyze the dynamic mutual constitution of online networking and the communicated content, as well as the resultant changes in cross-community communication.
This research has some limitations. First, the analyzed textual corpus in this study is to some extent determined by the seed users. Future studies might need to evaluate the effects of seed users on findings by changing or including more seed users. Second, this study only indirectly evaluated the validity of the topics discovered by topic modeling, and future studies might evaluate topic coherence by comparing the results of topic modeling with those of supervised machine learning methods or manual topic assignments.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This study is partially supported by the Philosophical and Social Sciences Foundation of Shanghai Municipality (2018BSH003).
