Abstract

Human trafficking, a grave human rights violation with far-reaching global consequences, serves as a compelling case study for analyzing multifaceted polarization dynamics in online discourse and the influence of social media on public perceptions and responses. Drawing on social identity theory and self-categorization theory, this article aims to elucidate both group polarization and opinion polarization surrounding human trafficking on social media. Through an integrated approach that combines clustering, social network analysis, text mining, and topic modeling, this study provides a comprehensive examination of community formation, influential actor identification, topic classification, and semantic analysis. The similarity between user-generated content from clustered groups and the topics identified is calculated to quantify the degree of multifaceted polarization. The findings reveal a robust community structure within the network and uncover divisions and structural characteristics across each subgroup. Utilizing the BERTopic model, thematic clusters such as vulnerable groups, persecution experiences, incident areas, law and politics, public awareness, contraband, and case events are identified, reflecting the primary public concerns regarding human trafficking. This research enhances our understanding of multifaceted polarization shaped by social identity in digital conversations about critical social issues and holds significant implications for policymakers, advocacy groups, and practitioners navigating public opinion regarding human trafficking in the digital realm.

Keywords

polarization, BERTopic, public opinion, clustering, human trafficking

Introduction

Human trafficking is a modern-day manifestation of enslavement, involving the coercive, deceptive, or forcible exploitation of individuals for purposes such as forced labor, sexual exploitation, or financial gain (Van Buren et al., 2019). As a transnational criminal enterprise, it imposes profound and multifaceted consequences on societies, economies, and global governance systems. Economically, trafficking distorts labor markets, depresses wages, and infiltrates supply chains, exacerbating inequality and undermining sustainable development initiatives (Zimmerman et al., 2011). In addition, the severe psychological trauma and health risks endured by victims constitute pressing public health challenges (Mak et al., 2023). Moreover, it is inextricably linked to gender-based violence and systemic discrimination, disproportionately affecting women and children.

The academic discourse on human trafficking is broad and highly interdisciplinary, encompassing criminology, public health, gender studies, and information systems. Prior studies have addressed a range of focal areas, including sexual violence and exploitation (Kejriwal & Szekely, 2022), psychosocial and public health interventions (Mak et al., 2023; Such et al., 2024), public awareness and prevention strategies (Savoia et al., 2023; Zimmerman et al., 2011), gender-sensitive frameworks and social equity goals (Cho, 2013), and legal and regulatory frameworks (Simmons et al., 2018). Research has also examined structural vulnerabilities, government responsiveness, and network-based models of trafficking relationships (Ali et al., 2021; Malik et al., 2018).

The rise of social media platforms—such as Facebook, Instagram, and X—has profoundly reshaped how trafficking is discussed, understood, and politicized (Cao & Yu, 2019). These platforms facilitate large-scale user-generated discourse, enabling high levels of interconnectivity (Lane et al., 2023). Human trafficking-related content online spans themes such as gender vulnerability, labor rights, international collaboration, victim experiences, and human rights advocacy (Moran & Prochaska, 2023). However, online discourse is increasingly fragmented by algorithmic curation, identity politics, and ideological biases, distorting public understanding and reinforcing oppositional narratives (Guess et al., 2023a; Xing et al., 2022).

Recent scholarship has increasingly focused on public discourse surrounding human trafficking, particularly in digital spaces (Keighley & Sanders, 2023; Wang, 2023). Studies have examined sex trafficking (Pfeffer et al., 2024), online child exploitation (Kranrattanasuit, 2024), and migrant victimization linked to deceptive employment promises (Stollwerck et al., 2024). In addition, researchers have assessed the design and targeting of awareness campaigns (Savoia et al., 2023) and explored the evolving concerns of younger demographics regarding trafficking and sexual harassment (Ardabili et al., 2024). One study also emphasized the critical needs of survivors within healthcare, criminal justice, and social services (Preble et al., 2022). Collectively, these studies reveal a highly fragmented landscape of public attention and institutional responses.

Public discourse on trafficking is often shaped by structurally distinct communities and ideologically entrenched groups (Boler et al., 2024). Polarization in this context refers to the divergence of public opinion into distinct and often conflicting factions, which leads to reduced mutual understanding and limited engagement across ideological boundaries. It can manifest in unipolar (e.g. ideological extremity), bipolar (e.g. binary opposition), or multipolar (e.g. competing ideological clusters) forms (Wakefield & Wakefield, 2023). In multipolar polarization, distinct ideological factions coexist and compete, resulting in a fragmented discursive landscape (Phillips & Carley, 2024).

Polarization has been extensively examined in domains such as political partisanship and misinformation (Moran & Prochaska, 2023). However, its impact on complex humanitarian crises—such as human trafficking—remains insufficiently explored. Trafficking, which intersects gender, labor, migration, human rights, and criminal justice, is particularly vulnerable to discursive segmentation. On social media, user communities gravitate toward frames that align with their values or advocacy goals, resulting in cognitive and thematic silos (Lee et al., 2014). These parallel discourses coexist in relative isolation, limiting opportunities for consensus and shared action.

In this context, polarization becomes more than a social media phenomenon—it constitutes a structural barrier to collective action. Advocacy groups, law enforcement, legal scholars, public health officials, and survivor communities often emphasize divergent aspects of trafficking—such as criminal justice, harm reduction, labor rights, or immigration control. These competing frames reinforce thematic silos, hindering interdisciplinary efforts and undermining systemic responses. Understanding how such discursive divides emerge and persist is crucial for improving policy coordination, resource allocation, and stakeholder collaboration.

In this study, we propose the concept of “multifaceted polarization,” which captures the simultaneous presence of structural fragmentation (i.e. the formation of distinct user communities) and semantic fragmentation (i.e. the identification of various topics) within online discussions. Grounded in social identity theory and self-categorization theory, this concept synthesizes two interrelated dimensions:

Group polarization: the tendency of individuals to self-organize into ideologically or affectively aligned communities, reinforced by in-group interactions and shared identity cues;

Opinion polarization: the emergence of divergent narratives around a common issue, as distinct communities interpret and prioritize the topic through different cognitive, normative, or affective lenses.

Unlike prior approaches that treat ideological and affective divides in isolation (Menezes et al., 2024; Renstrom et al., 2023), our approach foregrounds the interplay between network structure and discursive content, offering a more integrated analytical lens. Accordingly, we pose the following three research questions:

RQ1. How can social identity and self-categorization theories be used to construct a framework for analyzing multifaceted polarization?

RQ2. How can group and opinion polarization be synthesized to investigate multifaceted polarization in human trafficking discourse?

RQ3. How does multifaceted polarization in human trafficking discourse inform public engagement patterns and guide targeted interventions?

Methodologically, we integrate clustering, social network analysis (SNA), text mining, and topic modeling to detect user communities and discourse themes on the social media platform X (formerly Twitter). To assess multifaceted polarization, we analyze the degree of alignment between structural clusters and thematic segmentation using similarity analysis, revealing the contours of both structural and semantic polarization. The findings provide theoretical contributions to polarization research as well as practical implications for promoting more unified and effective anti-trafficking efforts.

Literature review

Online polarization in social media

The rise of social media platforms has fundamentally reshaped the landscape of public discourse, empowering users to generate, disseminate, and interpret content at an unprecedented scale (McNally & Bastos, 2025; Xing & Zhang, 2025). These platforms facilitate community formation by connecting individuals with shared interests or beliefs, thereby reinforcing social cohesion and enabling the expression of group identities (Guess et al., 2023b; Khawar & Boukes, 2024). However, the same affordances that foster collective engagement may simultaneously exacerbate polarization (Bail et al., 2018). Algorithmically curated content, selective exposure, and identity-driven interactions contribute to fragmented information environments and diminished exposure to dissenting views (Powell, 2024; Robertson et al., 2023).

Polarization in digital spaces is broadly characterized by the intensification of attitudinal, ideological, or interpretive divisions between groups, manifesting in both network structures and discursive content (Lee et al., 2014). Scholars have examined this phenomenon from diverse disciplinary perspectives, analyzing factors such as algorithmic bias (Guess et al., 2023a), cognitive heuristics (Buder et al., 2023), and the role of feedback mechanisms—such as likes, shares, and social endorsements—in reinforcing entrenched viewpoints. For example, Messing and Westwood (2012) demonstrated that prominent social endorsements increase users’ likelihood of selecting particular content while simultaneously reducing partisan-selective exposure, thus complicating assumptions about strictly ideological behavior. Complementing this, Nyhan et al. (2023) found that reducing exposure to like-minded sources can enhance openness to diverse perspectives and decrease impolite discourse, even if it does not significantly alter core attitudes. These findings suggest that exposure patterns and social signals interact in complex ways to shape online polarization.

Current studies further reveal that social media-induced polarization is a multifaceted phenomenon characterized by the exacerbation of ideological divisions and the reinforcement of pre-existing beliefs among digital users (Lee et al., 2014). Recent research has expanded the analytical lens to include affective polarization and ideological extremity (Risius et al., 2024), political divisiveness (Masullo, 2023), and even depolarization interventions (Combs et al., 2023), reflecting a growing concern with both the causes and mitigation of polarization in digital environments.

In synthesizing this broad literature, two influential lines of inquiry have emerged. The first focuses on group-level structural dynamics, examining how individuals cluster with like-minded others and form ideologically or emotionally cohesive communities—commonly termed group polarization (Wuestenenk et al., 2025). This research stream investigates the formation of homophilic clusters, the roles of influential users in shaping group norms, and the feedback mechanisms that reinforce alignment within communities over time (Haq & Kwok, 2024; Marchetti et al., 2024).

The second line of research examines content-level divergence, analyzing how communities engaged with the same issue construct increasingly distinct narratives (Overgaard, 2024). This phenomenon, variably termed opinion polarization, topic polarization, or semantic polarization, highlights the fragmentation of interpretive frames, emotional tones, and normative reasoning across groups (Masullo, 2023; Renstrom et al., 2023). Computational methods such as topic modeling, sentiment analysis, and moral framing analysis are widely employed to detect these divergences.

Although these two strands provide complementary insights, they are often examined in isolation. Recent studies increasingly emphasize their interdependence, recognizing that the structural formation of ideologically coherent communities and the semantic divergence of discourse frequently unfold simultaneously (Cinelli et al., 2024; Yang et al., 2021). This integrated perspective conceptualizes polarization as a multifaceted process, encompassing both the social dynamics of group formation and the fragmentation of public discourse. Such a framework advances a more nuanced understanding of how polarization emerges and stabilizes within complex online ecosystems, with significant implications for public opinion, collective decision-making, and the governance of digital communication.

Social identity and self-categorization

Online discourse on human trafficking illustrates how social identities and categorizations shape ideological divides and opinion formation in digital environments. Social identity theory and self-categorization theory provide foundational frameworks for understanding these dynamics (Moran & Prochaska, 2023), clarifying how group affiliations influence individual perceptions, behaviors, and the emergence of polarization.

Social identity theory, introduced by Tajfel and Turner (1979), explains how individuals construct their self-concept based on their affiliations with social groups. The theory posits that people categorize themselves into distinct social groups—based on factors such as race, ethnicity, religion, or shared interests—and integrate these memberships into their overall identity (Li, 2022; Velasquez & Montgomery, 2020). This process establishes psychological boundaries between “in-groups” and “out-groups,” shaping how individuals perceive themselves and others.

Recent studies have illuminated the application of social identity theory in social media research, demonstrating its explanatory power across diverse phenomena. For example, it has been employed to analyze behavioral intentions, social sorting, and the role of identity in influencing user engagement and platform participation (Lane et al., 2023; Qian & Seifried, 2023). Furthermore, research on intergroup dynamics in digital contexts has shown how social identity processes enhance group cohesion while simultaneously fueling intergroup conflict (Burgers et al., 2023). In addition, Ashforth and Mael (1989) emphasized that social identity theory explains how individuals develop a strong sense of belonging to their in-group, often accompanied by preferential treatment of in-group members and antagonism toward out-groups—processes that intensify polarization (Dutot, 2020).

Self-categorization theory, introduced by Hogg and Turner (1987), extends social identity theory by detailing how individuals categorize themselves across varying levels of abstraction—from specific personal identities to broader social identities. This theory emphasizes the situational prominence of particular identities and explains how such prominence shapes cognition and behavior within specific contexts (Koschate et al., 2021). In social media environments, users often conform to the norms and values of their online communities, as deviation may provoke ideological conflict and social sanctions.

Empirical studies applying self-categorization theory have examined phenomena such as information sharing, impression formation, evaluation of social identities, and adherence to social norms in influencer marketing contexts (Qian & Seifried, 2023). These findings demonstrate how salient self-categorizations foster alignment with group norms and strengthen collective identity, thereby influencing how individuals interpret and disseminate information online (Koschate et al., 2021).

Group polarization describes the tendency for group discussions to shift members’ attitudes further toward the group’s prevailing viewpoint. Social identity theory explains this phenomenon as individuals’ motivation to maintain and enhance a positive image of their in-group, reinforcing adherence to group norms and amplifying distinctions between in-group and out-group attitudes (Fritsche et al., 2018; Wakefield & Wakefield, 2023). In contrast, opinion polarization refers to the increasing divergence of perspectives between groups. Self-categorization theory clarifies this process by showing how individuals adopt the dominant narratives of their in-group while dismissing or rejecting out-group views (Flanagin et al., 2014).

In online discussions about human trafficking, identity-driven groups interpret the issue through distinct thematic frames that align with their collective narratives. For example, law enforcement communities predominantly frame trafficking as a criminal justice challenge, advocacy groups emphasize its human rights dimensions, and political factions situate it within broader ideological debates (Tsai & Bagozzi, 2014). Individuals selectively expose themselves to information that resonates with their in-group’s framing, thereby reinforcing both group cohesion and ideological divisions (Chen et al., 2021). Consequently, group polarization (the intensification of attitudes within groups) and opinion polarization (the widening of differences between groups) emerge as interrelated phenomena shaped by the interplay of social identity and self-categorization processes.

Social identity and self-categorization theories account for how in-group affiliation and identity salience shape online polarization. In-group favoritism and norm conformity strengthen internal consensus (group polarization), while intergroup differentiation exacerbates ideological divides (opinion polarization). These mechanisms demonstrate how identity-driven dynamics structure digital discourse, offering insights into managing polarization and fostering more integrative, cross-group communication in online environments.

Methodology

Data collection and processing

Rebranded as “X” in 2023, Twitter functions as an online social media and networking platform that facilitates user connectivity, communication, and interaction. With features such as likes, retweets, and replies, X promotes user engagement, fostering seamless content interaction and amplification. Consequently, it has become a crucial source for news updates, coverage of significant events, and discussions on trending topics, establishing itself as a pivotal element in contemporary information consumption. Notably, a substantial number of users on X engage with the issue of human trafficking, with thousands of daily tweets dedicated to this topic.

A web crawler was employed to extract data from X using the keyword “human trafficking” via the X API. The X API provides a programmatic gateway for obtaining essential components related to human trafficking, including tweets, retweets, timestamps, text content, and user profiles, while ensuring user anonymity. Data was collected from 1 January 2025 to 7 January 2025.

During this period, we collected user data, which included user IDs, tweets, retweets, comments, interaction dates, and profile information such as occupation, city, place of residence, links to social media profiles, and account creation date. To maintain data integrity, we excluded comments from users with incomplete profile information. We also screened user profiles to identify and remove bot accounts, thereby improving the reliability of our findings. User profiles were examined solely for bot detection purposes and not for any third-party use.

In the data processing phase, we employed Excel functions to remove invalid characters, including hyperlinks, null values, emoticons, and punctuation marks (e.g. $, &, [], and similar symbols). We also conducted a manual review to ensure spelling accuracy. Following this cleaning process, we retained a total of 8138 tweets for further analysis.
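The cleaning steps above were carried out with Excel functions. For readers who prefer a scripted equivalent, the following is a minimal Python sketch of the same kind of cleaning. The regular expressions, the emoji code-point ranges, and the function names `clean_tweet` and `clean_corpus` are illustrative assumptions, not the procedure actually used in this study.

```python
import re
import string

# Assumed patterns: a simple URL matcher and a rough (non-exhaustive) emoji range.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Return the cleaned tweet text, or None if nothing usable remains."""
    if text is None:  # null values are dropped
        return None
    text = URL_PATTERN.sub(" ", text)    # remove hyperlinks first, while "://" is intact
    text = EMOJI_PATTERN.sub(" ", text)  # remove emoji characters
    text = text.translate(PUNCT_TABLE)   # remove ASCII punctuation such as $, &, []
    text = " ".join(text.split())        # collapse repeated whitespace
    return text or None


def clean_corpus(tweets):
    """Clean every tweet and keep only the non-empty results."""
    cleaned = (clean_tweet(t) for t in tweets)
    return [t for t in cleaned if t is not None]
```

Hyperlinks are removed before punctuation so that URL fragments are not left behind as stray tokens. Spelling review, which was done manually in this study, is outside the scope of the sketch.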

Modeling multifaceted polarization

The multifaceted polarization framework (Figure 1) applies social identity theory and self-categorization theory to uncover how group dynamics and ideological divides shape online discourse on human trafficking. This framework aims to clarify: (1) how multifaceted polarization is driven by social identity and self-categorization processes, (2) how it can be quantified by combining group polarization and opinion polarization, and (3) how the empirical results align with and contribute to these theoretical perspectives. It evaluates two dimensions:

Group polarization, encompassing group clustering, social network analysis, and influential user identification.

Opinion polarization, involving thematic segmentation, topic distribution, and keyword identification.

The mechanism of multifaceted polarization is examined through a two-step analytical process. First, the overlap between group structures and thematic divisions is quantified using similarity metrics. Second, the extent to which each group’s discourse concentrates on specific topics is assessed using the Herfindahl Topic Concentration Index (HTCI). This approach not only demonstrates the applicability of the theoretical frameworks but also enhances our understanding of their explanatory power in the context of online discourse polarization.

Figure 1. Theoretical framework and analytical process of multifaceted polarization.

Group clustering

This study utilizes NodeXL, an integrated SNA tool in Microsoft Excel, to cluster users into diverse groups based on their interactions in online discussions related to human trafficking. We examine network structures using NodeXL’s Clauset-Newman-Moore algorithm (Clauset et al., 2004), which efficiently identifies communities within extensive networks by initially treating each vertex as an individual cluster and then merging them pairwise. Following the clustering process, the Harel-Koren Fast Multiscale layout algorithm is employed to generate a visual representation of the graph. In addition, the Fruchterman-Reingold algorithm, a force-directed layout method, is used to arrange the communities.
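To make the merging logic concrete, the following is a minimal Python sketch of greedy agglomerative modularity maximization in the spirit of the Clauset-Newman-Moore approach: every vertex starts as its own community, and the pair of connected communities whose merger most increases modularity is merged until no merger helps. It is a naive version for small graphs, not NodeXL's implementation and not the optimized data structures of the original algorithm. Treating edges as undirected, dropping self-loops and duplicate edges, and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def _simple_edges(edges):
    """Undirected, de-duplicated edge list without self-loops (nodes must be sortable)."""
    return sorted({tuple(sorted(e)) for e in edges if e[0] != e[1]})


def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    simple = _simple_edges(edges)
    m = len(simple)
    if m == 0:
        return 0.0
    label = {v: i for i, group in enumerate(communities) for v in group}
    internal = defaultdict(int)    # number of edges inside each community
    degree_sum = defaultdict(int)  # summed vertex degree of each community
    for u, v in simple:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


def greedy_communities(edges):
    """Merge communities pairwise while the best merger still increases modularity."""
    simple = _simple_edges(edges)
    m = len(simple)
    nodes = sorted({v for e in simple for v in e})
    label = {v: i for i, v in enumerate(nodes)}      # vertex -> community id
    members = {i: [v] for i, v in enumerate(nodes)}  # community id -> vertices
    degree_sum = defaultdict(int)
    for u, v in simple:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
    while True:
        # Count the edges running between each pair of current communities.
        between = defaultdict(int)
        for u, v in simple:
            cu, cv = label[u], label[v]
            if cu != cv:
                between[(min(cu, cv), max(cu, cv))] += 1
        best_gain, best_pair = 0.0, None
        for (ci, cj), count in sorted(between.items()):
            # Modularity change from merging ci and cj: 2 * (e_ij - a_i * a_j).
            gain = 2 * (count / (2 * m) - degree_sum[ci] * degree_sum[cj] / (2 * m) ** 2)
            if gain > best_gain:
                best_gain, best_pair = gain, (ci, cj)
        if best_pair is None:  # no merger improves modularity
            break
        ci, cj = best_pair
        for v in members[cj]:
            label[v] = ci
        members[ci].extend(members.pop(cj))
        degree_sum[ci] += degree_sum.pop(cj)
    return sorted(sorted(group) for group in members.values())
```

On a toy graph of two triangles joined by a single bridge edge, the procedure stops with the two triangles as separate communities, because merging them across the bridge would lower modularity. Vertices that appear only in self-loops are ignored by this sketch.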

In the clustering graph, vertices represent X users, while edges capture user interactions such as retweets, replies, and mentions, which reflect engagement between users (Colleoni et al., 2014). The network’s size, connectivity, and cohesiveness are assessed through graph density (GD) and average path length (APL). In addition, the prestige, prominence, importance, and authority of users are evaluated through degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and PageRank.
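As an illustration of the two whole-network measures, the sketch below computes graph density and average path length for a small interaction network, together with in-degree as the simplest centrality. The conventions used here are assumptions and may differ from NodeXL's: density is taken as unique non-loop directed edges divided by n(n - 1), and path lengths ignore edge direction and are averaged only over vertex pairs that can reach each other. The function names are hypothetical.

```python
from collections import defaultdict, deque


def graph_density(edges):
    """Directed density: unique non-loop edges divided by n * (n - 1)."""
    nodes = {v for e in edges for v in e}
    unique = {(u, v) for u, v in edges if u != v}
    n = len(nodes)
    return len(unique) / (n * (n - 1)) if n > 1 else 0.0


def in_degree(edges):
    """Number of distinct users pointing at each vertex (self-loops ignored)."""
    counts = defaultdict(int)
    for _, target in {(u, v) for u, v in edges if u != v}:
        counts[target] += 1
    return dict(counts)


def average_path_length(edges):
    """Mean shortest-path length over connected vertex pairs, ignoring direction."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    total, pairs = 0, 0
    for source in neighbors:
        # Breadth-first search gives shortest hop counts from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy example: three users each retweet the same hub account.
star = [("a", "hub"), ("b", "hub"), ("c", "hub")]
print(graph_density(star), average_path_length(star), in_degree(star))
```

A hub-and-spoke pattern of this kind has few edges relative to the number of possible edges, yet every pair of users is at most two steps apart, which is why low density and short path length can coexist.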

Topic identification

We employ BERTopic, a transformer-based topic modeling technique, to uncover online public opinion on human trafficking across X. The BERTopic model utilizes BERT (Bidirectional Encoder Representations from Transformers) embeddings and class-based TF-IDF to generate concise and meaningful topic clusters (Xing et al., 2024). Traditional topic modeling approaches, such as Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF), face challenges in analyzing short and noisy texts (Menezes et al., 2024). These methods rely on word co-occurrence patterns and bag-of-words representations, which often fail to capture contextual nuances, leading to less coherent and more overlapping topics in datasets with sparse textual content (Celikten & Onan, 2025). In our previous experiments, BERTopic consistently demonstrated superior performance compared to LDA and NMF, producing more interpretable and well-separated topics. Therefore, BERTopic was adopted for topic clustering in this study. The workflow is illustrated as follows:

Document embedding

Within the BERTopic framework, we transform documents into vector representations within a semantic space, based on the assumption that documents sharing the same topic exhibit significant semantic similarity. The embedding process underlying BERTopic relies on the Sentence-BERT (SBERT) framework, which converts sentences and paragraphs into dense vector representations using pre-trained language models. This approach has demonstrated robust performance across various sentence-embedding tasks.

Document clustering

As data dimensionality increases, the distance to the nearest data point approaches the distance to the farthest, which weakens distance-based clustering. We therefore employ UMAP (Uniform Manifold Approximation and Projection) to reduce the dimensionality of the document embeddings; UMAP can be applied to embeddings from language models that operate in differing dimensional spaces. These reduced embeddings are then clustered using the hierarchical density-based spatial clustering of applications with noise (HDBSCAN), an extension of density-based spatial clustering of applications with noise (DBSCAN) that supports clustering in datasets with varying density structures. HDBSCAN accommodates noise as outliers through a soft-clustering process, reducing the likelihood of misassigning unrelated documents to clusters and thereby enhancing topic representation quality.

Topic representation

The introduction of a modified Term Frequency-Inverse Document Frequency (TF-IDF) approach enables topic identification based on word distribution within document clusters, assessing word significance within specific topics (Ao et al., 2023). Building upon the core principles of TF-IDF, documents within a cluster are treated as a cohesive unit through concatenation, leading to the creation of a class-based TF-IDF (c-TF-IDF) for topic description.

$$W_{x,c} = \lVert tf_{x,c} \rVert \times \log\left(1 + \frac{A}{f_{x}}\right) \tag{1}$$

$tf_{x,c}$ indicates the frequency of word $x$ in class $c$; $f_{x}$ represents the frequency of word $x$ across all classes; and $A$ represents the average number of words per class.
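Equation (1) can be illustrated with a short Python sketch that concatenates the documents of each cluster and weights every word accordingly. Two points are assumptions rather than details stated here: the norm bars around the term frequency are read as L1 normalization within the class, and the input is assumed to be already tokenized. The function names and the toy data are hypothetical, and the BERTopic library's own implementation may differ in detail.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_class):
    """Class-based TF-IDF.

    docs_by_class maps a class (topic cluster) to a list of tokenized documents.
    Returns a mapping class -> {word: weight}.
    """
    # Treat all documents of a class as one concatenated document.
    tf = {
        c: Counter(token for doc in docs for token in doc)
        for c, docs in docs_by_class.items()
    }
    # f_x: frequency of each word across all classes.
    total_freq = Counter()
    for counts in tf.values():
        total_freq.update(counts)
    # A: average number of words per class.
    avg_words = sum(total_freq.values()) / len(tf)

    weights = {}
    for c, counts in tf.items():
        class_total = sum(counts.values())
        weights[c] = {
            word: (n / class_total) * math.log(1 + avg_words / total_freq[word])
            for word, n in counts.items()
        }
    return weights


def top_keywords(weights, cls, k=3):
    """The k highest-weighted words of one class."""
    ranked = sorted(weights[cls].items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:k]]


# Hypothetical toy clusters.
toy = {
    "law": [["court", "law", "victim"], ["law", "police"]],
    "awareness": [["victim", "awareness"], ["awareness", "campaign", "victim"]],
}
print(top_keywords(c_tf_idf(toy), "law"))
```

In the toy data, "victim" occurs in both clusters and therefore receives a lower weight in the "law" cluster than "court", even though both occur there once. This is the differentiating behavior that the logarithmic term is meant to provide.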

Similarity/dissimilarity

To uncover the mechanisms underlying multifaceted polarization in human trafficking discourse—by integrating both group polarization and opinion polarization—we calculate textual similarity between user-generated content from clustered groups and the topics identified by the BERTopic model. We represent the textual content tweeted by individuals in each clustered group as TF-IDF vectors, where each term’s weight reflects its significance within the respective discourse. These vectors are then compared with keywords in thematic categories, since the topics are represented through structured keyword rankings generated by the c-TF-IDF algorithm, which assigns greater weight to words that differentiate specific topics.

Subsequently, cosine similarity is calculated to quantify the degree of alignment between each group’s discourse and the identified topics. Cosine similarity is a standard metric in information retrieval, natural language processing, and the social sciences (Agrawal et al., 2021; Kasban & Nassar, 2020). Because TF-IDF weights are non-negative, the similarity scores range from 0 (indicating complete dissimilarity) to 1 (indicating identical content), allowing us to systematically evaluate textual similarity and thematic alignment in the discourse on human trafficking.
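The comparison between a group's discourse and a topic can be sketched in a few lines of Python. Here each group's tweets are pooled into one token list and turned into a TF-IDF vector, and each topic is a dictionary of keyword weights. The smoothed inverse document frequency, the sparse dictionary representation, the function names, and the toy values are all assumptions made for illustration. The exact weighting scheme used in the study is not specified at this level of detail.

```python
import math
from collections import Counter


def tf_idf_vectors(group_tokens):
    """Map each group to a sparse TF-IDF vector over its pooled tokens."""
    n_groups = len(group_tokens)
    doc_freq = Counter()
    for tokens in group_tokens.values():
        doc_freq.update(set(tokens))  # in how many groups each word occurs
    vectors = {}
    for group, tokens in group_tokens.items():
        counts = Counter(tokens)
        total = len(tokens)
        vectors[group] = {
            # Smoothed IDF (an assumed variant) keeps weights strictly positive.
            word: (n / total) * (math.log((1 + n_groups) / (1 + doc_freq[word])) + 1)
            for word, n in counts.items()
        }
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors given as dictionaries."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical groups and one hypothetical topic keyword vector.
groups = {
    "G1": ["law", "court", "law", "police"],
    "G2": ["victim", "awareness", "campaign", "awareness"],
}
topic_law = {"law": 0.5, "court": 0.36, "police": 0.36}
vectors = tf_idf_vectors(groups)
similarities = {g: cosine_similarity(v, topic_law) for g, v in vectors.items()}
print(similarities)
```

A group whose vocabulary does not overlap with a topic's keywords scores exactly 0, while a group whose weighted vocabulary is proportional to the keyword weights scores 1.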

Multifaceted polarization

The HTCI is an adaptation of the Herfindahl-Hirschman Index (HHI) (Song et al., 2013), applied to quantify how narrowly a group’s discussion focuses on specific topics. For a given group G with similarity scores across n topics, the HTCI is calculated as:

$$\mathrm{HTCI} = \sum_{i=1}^{n} \left(\frac{s_{i}}{S}\right)^{2} \tag{2}$$

where $s_{i}$ represents the similarity score of topic $i$ for the group, and $S$ denotes the total similarity across all topics ($S = \sum_{i=1}^{n} s_{i}$). The HTCI provides a continuous measure of online polarization, ranging from $1/n$ (uniform topic distribution, indicating maximal diversity and minimal polarization; this lower bound approaches 0 as the number of topics grows) to 1 (complete focus on a single topic, signifying extreme polarization).
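A direct transcription of Equation (2) into Python may help in reading the index. The function name `htci` and the example scores are hypothetical, and the input is assumed to be a list of non-negative similarity scores with a positive sum.

```python
def htci(similarities):
    """Herfindahl Topic Concentration Index for one group's topic similarities."""
    total = sum(similarities)  # S in Equation (2)
    if total <= 0:
        raise ValueError("similarity scores must have a positive sum")
    return sum((s / total) ** 2 for s in similarities)


print(htci([0.25, 0.25, 0.25, 0.25]))  # evenly spread over four topics
print(htci([0.9, 0.0, 0.0]))           # concentrated on a single topic
print(htci([0.6, 0.2, 0.2]))           # intermediate case
```

With scores spread evenly over $n$ topics the index evaluates to $1/n$ (0.25 for four topics), which approaches 0 as the number of topics grows. A group whose similarity falls entirely on one topic scores 1.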

Finally, this study computes the Weighted Mean Polarization Index (WMPI) of multifaceted polarization, the size-weighted average of group HTCI scores, to capture overall polarization intensity across the entire dataset:

$$\mathrm{WMPI} = \frac{\sum_{g=1}^{N} \mathrm{HTCI}_{g} \times \mathrm{Size}_{g}}{\sum_{g=1}^{N} \mathrm{Size}_{g}} \tag{3}$$
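Equation (3) is a weighted mean and can be sketched as follows. The function name `wmpi` is hypothetical, group size is assumed to be the number of users in the group, and the HTCI values and sizes in the example are invented purely for illustration.

```python
def wmpi(groups):
    """Size-weighted mean of group HTCI scores.

    groups is an iterable of (htci_score, group_size) pairs.
    """
    groups = list(groups)
    total_size = sum(size for _, size in groups)
    if total_size <= 0:
        raise ValueError("total group size must be positive")
    return sum(score * size for score, size in groups) / total_size


# Hypothetical values: a small, highly concentrated group and a large, diffuse one.
print(wmpi([(0.5, 100), (0.1, 300)]))
```

Because the larger group carries three times the weight, the result (about 0.2) lies closer to its score of 0.1 than an unweighted mean of 0.3 would.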

Results

Clustering network and group polarization

The clustered network depicted in Figure 2 is a directed graph comprising 3846 vertices and 4150 edges, indicating substantial connectivity. This network contains 356 distinct groups, each representing a unique community structure. The interaction dynamics within this network are multifaceted, with 2398 retweets, 315 original tweets, 435 replies, and 99 mentions, reflecting a broad spectrum of communicative activities. In addition, there are 395 self-loops, indicating some level of self-referential behavior. The reciprocated vertex pair ratio is relatively low at 0.0084, indicating limited mutual engagement, while the reciprocated edge ratio is 0.0166. The network consists of 507 connected components, including 125 single-vertex components, revealing the presence of isolated users or sub-networks.

Figure 2. Clustering network of group polarization on human trafficking.

The APL of 4.9467 indicates a moderately connected network structure, suggesting that information can propagate through the network with reasonable efficiency despite its size. The GD of 0.0004 reveals a sparsely connected network. In addition, the high modularity score of 0.8478 signifies a well-defined community structure, where nodes within the same cluster exhibit significantly higher interconnectivity compared to nodes in different clusters.

Table 1 summarizes structural metrics for the top 10 clusters based on modularity-driven community detection. G1, the largest cluster (223 users), shows low GD (0.004) but a short APL (1.98), suggesting a centralized network with efficient information flow. This structure reflects the presence of key influencers who drive widespread dissemination.

Table 1. Statistics in the Top 10 Clusters.

Group	Vertices	Total edges	Self-loops	Graph density	Average path length
G1	223	223	1	0.004	1.982
G2	162	275	18	0.008	2.776
G3	142	142	1	0.007	2.053
G4	104	205	5	0.019	1.947
G5	94	94	1	0.011	2.040
G6	91	98	1	0.011	2.160
G7	85	90	5	0.008	2.021
G8	84	137	15	0.017	4.582
G9	56	65	1	0.020	3.224
G10	51	54	1	0.021	2.957

In contrast, G2 demonstrates a high number of self-loops (n = 18) and the longest APL among the seven largest clusters (2.78), indicating insular discourse with limited external connectivity, features consistent with the dynamics of echo chambers.

Clusters such as G4 and G8 display higher GD (0.019 and 0.017, respectively), yet diverge in APL. G4 (APL = 1.95) reflects tightly interconnected users, possibly enabling rapid coordination, whereas G8 (APL = 4.58) may contain dispersed or weakly linked subgroups which may impede efficient exchange.

Other clusters (e.g. G9, G10) exhibit dense but compact structures, marked by repetitive internal interactions and limited structural reach. These variations highlight differences in group cohesion, communication efficiency, and potential influence across the broader network.

Table 2 presents the top 15 users ranked by betweenness centrality—a key indicator of brokerage and gatekeeping in information flow. The vertex labels indicating user IDs have been normalized to anonymize individual accounts while preserving structural integrity. User1 consistently ranks highest across multiple centrality metrics, including in-degree (226), closeness (0.071), eigenvector centrality (0.718), and PageRank (0.012), positioning this account as a dominant opinion leader. Its exceptionally high betweenness score (278,454) suggests a unique role in linking otherwise disconnected communities, thereby facilitating the spread of polarized content across network clusters.

Table 2.

Top Influential Users Based on Betweenness Centrality.

User2 and User3 also occupy structurally influential positions. Although User2 has a moderate in-degree (87), its high betweenness (189,787) and closeness (0.061) indicate extensive reach across diverse subgroups. User3, despite having no inbound ties, ranks high in closeness (0.065) and eigenvector centrality (0.047), implying influence through proximity to prominent users rather than through direct popularity.

Accounts such as User12, User14, and User15 appear structurally peripheral, each with only one or two connections. However, their disproportionately high betweenness scores reveal their function as strategic bridges between otherwise unconnected regions of the network, amplifying their structural importance despite minimal direct interaction.

From an influence diffusion perspective, users like User6, User9, and User13 exhibit high in-degree and PageRank values, reflecting their roles as attention magnets and local information hubs. For example, User9 (in-degree = 114; PageRank = 0.006) may exert concentrated influence within a specific ideological cluster, even if its influence across the broader network is comparatively constrained.
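The observation that accounts with only one or two connections can still score high on betweenness can be illustrated with a small sketch. It implements Brandes' algorithm for unnormalized betweenness on an unweighted, undirected toy graph; the graph, the node names, and the undirected simplification are assumptions made for illustration and do not reproduce the directed network or the values in Table 2.

```python
from collections import deque

def betweenness(neighbors):
    """Unnormalized betweenness (Brandes' algorithm), unweighted undirected graph."""
    score = {v: 0.0 for v in neighbors}
    for s in neighbors:
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in neighbors}  # predecessors on shortest paths
        sigma = dict.fromkeys(neighbors, 0) # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back toward s.
        delta = dict.fromkeys(neighbors, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}

# Two hub-and-spoke communities joined only through one low-degree account.
edges = [("hub1", "a1"), ("hub1", "a2"), ("hub1", "a3"),
         ("hub2", "b1"), ("hub2", "b2"), ("hub2", "b3"),
         ("hub1", "bridge"), ("bridge", "hub2")]
neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

scores = betweenness(neighbors)
degree = {v: len(n) for v, n in neighbors.items()}
for v in ("hub1", "bridge", "a1"):
    print(v, degree[v], scores[v])
```

In this toy graph the bridge account has only two connections but a betweenness of 16, close to the hub's 18 (degree 4), because every shortest path between the two communities passes through it. The peripheral account `a1` scores 0.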

Topic identification and opinion polarization

The BERTopic algorithm identifies 29 topics within our dataset. Figure 3 illustrates the distribution of these topics related to human trafficking on X. Using dimensionality reduction techniques, the spatial distribution chart visualizes topics and documents in a two-dimensional space. The distance between nodes reflects their semantic similarity, while the colors of the points indicate their subject categories. Outlier topics within primary topics are infrequent, as most documents closely align with the central topic cluster. This indicates that discussions related to human trafficking predominantly focus on the core semantic domain of the primary topic cluster, demonstrating a high level of semantic concentration.

Figure 3.

Distribution of integrated topics on human trafficking.

To ensure reliable theme identification, two experts in deep learning and sociology conducted a systematic, computer-assisted content analysis of topic labels and top-ranked keywords, resulting in the inductive categorization of seven coherent themes: “vulnerable groups,” “persecution experience,” “incident area,” “law and politics,” “public consciousness,” “contraband,” and “case events.”

Table 3 presents the top five keywords ranked by c-TF-IDF scores in each topic. The specific c-TF-IDF values of each keyword are provided in Appendix 1. The theme “vulnerable groups” comprises topics 0, 9, 17, and 18, with representative keywords including “children, women, parents, girls, kids,” and so on. The theme “persecution experience” encompasses topics 4, 6, 13, and 27, featuring prominent keywords such as “pedophilia, rape, harvesting, slavery, endworkness,” and so on. The theme “incident area” is composed of topics 1, 7, 8, and 14, primarily containing location-related keywords like “Israel, Ukraine, Rwanda, Africa, American,” and so on. The theme “law and politics” includes topics 5, 22, 23, and 26, focusing on terms such as “crimes, republicans, federal, illegals, court,” and so on. The theme “public consciousness” involves topics 3, 10, 11, and 19, containing keywords such as “community, awareness, WalkForFreedom, preventing, disgraceful, extremism,” and so on. The theme “contraband” consists of topics 2, 16, 21, and 25, emphasizing “border, drug, earninday, combat, smuggling, migrant,” and so on. Finally, the theme “case events” is composed of topics 12, 15, 20, 24, and 28, represented by keywords such as “Jadi, POTUS, Biden, Joe,” and so on. The identified topics achieve a topic coherence score of 0.236 and a topic diversity score of 0.864, indicating robust model performance and the emergence of semantically distinct and well-structured topics.
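As an illustration of what a topic diversity score measures, the sketch below uses the common definition of diversity as the share of unique words among the top keywords of all topics. That this definition, and a cutoff of five keywords, match the computation behind the reported 0.864 is an assumption. The three keyword lists are taken from Table 3; the function name `topic_diversity` is illustrative.

```python
def topic_diversity(topic_keywords, top_k=5):
    """Share of unique words among the top_k keywords of every topic."""
    words = [w.lower() for kws in topic_keywords.values() for w in kws[:top_k]]
    return len(set(words)) / len(words)

# Top-five keywords of three topics as listed in Table 3.
sample_topics = {
    4: ["porn", "pedophilia", "rape", "harvesting", "sextrafficking"],
    8: ["organ", "harvesting", "China", "Arizona", "Ukraine"],
    27: ["interview", "trafficking", "abuse", "pedophilia", "organ"],
}

print(topic_diversity(sample_topics))
```

These three topics share the keywords "harvesting," "pedophilia," and "organ," so only 12 of their 15 keywords are unique and the diversity of this subset is 0.8. A score close to 1 therefore means that topics rarely reuse each other's top keywords.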

Table 3.

Keywords and Distributed Topics in the Themes.

Themes	Topics	Top keywords (ranking by c-TF-IDF score)	Proportion
Vulnerable groups	0	Children, child, women, parents, families	6.98%
	9	Women, girls, rights, people, humantrafficking	4.49%
	17	Ohio, teacher, kids, victims, vulnerable	3.11%
	18	Tate, andrew, brother, slaves, survivors	2.98%
Persecution experience	4	Porn, pedophilia, rape, harvesting, sextrafficking	4.95%
	6	Slavery, endworkness, salve, remorse, satanism	4.88%
	13	Podcast, suffering, tour, torture, trauma	3.30%
	27	Interview, trafficking, abuse, pedophilia, organ	0.84%
Incident area	1	Israel, israeli, Hamas, boundaries, America	6.37%
	7	Ukraine, bio, labs, Kenya, Rwanda	4.62%
	8	Organ, harvesting, China, Arizona, Ukraine	4.50%
	14	Rwanda, refugees, African, headmen, NYC	3.29%
Law and politics	5	Charges, crime, law, commissioner, unvetted	4.90%
	22	Power, executive, ballot, police, operating	1.96%
	23	Republications, voted, Gaetz, federal, illegals	1.89%
	26	Challenges, supreme, court, trade, empire	0.96%
Public consciousness	3	Claytonconway65, elonmusk, Raph, accountability, contract	5.54%
	10	Community, awareness, join, WalkForFreedom, preventing	3.45%
	11	GregAbbott, TX, Texas, disgraceful, truth	3.40%
	19	Fentanyl, Jordan, Denomi, extremism, strength	2.97%
Contraband	2	Border, cartels, drug, videos, indoctrination	5.85%
	16	EarnInDay, EarnIn, combat, drug, operation	3.13%
	21	Smuggling, migrant, trafficking, money, fraud	2.52%
	25	Arrested, biolabs, death, slaves, import	1.24%
Case events	12	Jadi, NG, korban, Tristan, MikeGil	3.40%
	15	POTUS, Trump, Donald, Travis, Diana	3.22%
	20	Biden, cartels, border, boundaries, Hillary	2.67%
	24	Peru, 43, syndicate, YouTube, released	1.75%
	28	Biden, Joe, oversees, Hollywood, accused	0.84%
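For readers unfamiliar with the ranking used in Table 3, the following is a minimal sketch of class-based TF-IDF (c-TF-IDF) as described for BERTopic: the frequency of a word within a topic is multiplied by log(1 + A / f), where A is the average number of words per topic and f is the word's frequency across all topics. The sketch omits the normalization step applied in the library, and the two token lists are hypothetical, so the scores are not comparable to those in Appendix 1.

```python
import math
from collections import Counter

def c_tf_idf(class_tokens):
    """Score each word per class: tf(word, class) * log(1 + A / f(word))."""
    tf = {c: Counter(tokens) for c, tokens in class_tokens.items()}
    overall = Counter()
    for counts in tf.values():
        overall.update(counts)
    # A: average number of words per class.
    avg_words = sum(len(t) for t in class_tokens.values()) / len(class_tokens)
    return {
        c: {w: n * math.log(1 + avg_words / overall[w]) for w, n in counts.items()}
        for c, counts in tf.items()
    }

# Hypothetical token lists, each standing for all posts of one topic joined together.
class_tokens = {
    "vulnerable": ["children", "women", "children", "trafficking"],
    "border": ["border", "cartels", "border", "trafficking"],
}

weights = c_tf_idf(class_tokens)
for label, word_scores in weights.items():
    ranked = sorted(word_scores, key=word_scores.get, reverse=True)
    print(label, ranked)
```

In this toy input, "children" and "border" rank first in their respective classes, while "trafficking," which appears in both, ranks last in each. This is why keywords that are frequent everywhere in the corpus tend not to dominate the per-topic rankings.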

Similarity analysis and multifaceted polarization

Figure 4 presents the outcomes of the similarity analysis between clustered groups and identified topics. The findings reveal that individuals within the same cluster tend to exhibit stronger alignment with their most semantically relevant topics. In addition, the results indicate that different groups focus intensely on specific thematic areas while paying comparatively less attention to others. For instance, users in clusters G228 to G232 exhibit a pronounced focus on Topic 1, characterized by keywords such as “America, Israel, boundaries,” which aligns with the “incident area” theme. Similarly, individuals in clusters G157 to G161 demonstrate particular attention to Topic 0 and Topic 9, both of which relate to “vulnerable groups,” as evidenced by keywords such as “children, women, people,” and “parents.”

Figure 4.

Similarities between clustered groups and identified topics.
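The kind of group-to-topic comparison summarized in Figure 4 can be sketched with TF-IDF vectors and cosine similarity. The example below is a simplified, self-contained version: the smoothed IDF variant, the tokenization, and the two group texts are assumptions made for illustration, while the two topic keyword lists are taken from Table 3. It is not the study's pipeline.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Sparse TF-IDF vectors with a smoothed inverse document frequency."""
    n = len(docs)
    df = Counter(w for tokens in docs.values() for w in set(tokens))
    idf = {w: math.log((1 + n) / (1 + c)) + 1 for w, c in df.items()}
    return {name: {w: tf * idf[w] for w, tf in Counter(tokens).items()}
            for name, tokens in docs.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Topic keyword lists from Table 3 (topics 1 and 0).
topics = {
    "topic_1": ["israel", "israeli", "hamas", "boundaries", "america"],
    "topic_0": ["children", "child", "women", "parents", "families"],
}
# Hypothetical pooled posts for two user groups.
groups = {
    "group_a": "israel hamas america boundaries israel".split(),
    "group_b": "children women parents trafficking".split(),
}

vectors = tfidf_vectors({**topics, **groups})
similarity = {g: {t: cosine(vectors[g], vectors[t]) for t in topics}
              for g in groups}
for g, row in similarity.items():
    print(g, {t: round(s, 3) for t, s in row.items()})
```

Each row of `similarity` corresponds to one group and each column to one topic. In this toy input every group has a positive similarity with exactly one topic and zero with the other, which is the extreme form of the concentrated pattern described above.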

The identified topics are further synthesized to correspond with key themes related to human trafficking within each clustered group. For example, Groups 8, 16, 93, 124, and 269 predominantly address issues concerning vulnerable populations, whereas Groups 4, 21, 53, 174, and 287 focus on experiences of persecution. Groups 44, 68, 249, 251, and 301 emphasize the geographical distribution of trafficking incidents, while Groups 6, 28, 39, 273, and 345 engage in discussions centered on the legal and political dimensions of human trafficking. Conversely, Groups 2, 83, 125, 318, and 321 prioritize public awareness and consciousness, whereas Groups 7, 46, 75, 164, and 193 focus on contraband and illicit trade. Finally, Groups 24, 33, 157, 183, and 253 primarily discuss specific human trafficking cases.

Table 4 highlights a strong inverse relationship between group size and thematic coherence, as measured by the HTCI. The WMPI of 0.75 indicates a moderate yet notable level of polarization across the dataset. Larger groups tend to exhibit lower topic concentration and greater internal variability, as reflected in their higher standard deviations. A robust negative correlation is observed between group size and topic concentration (Pearson’s r = −.72, p < .001), suggesting that as group size increases, thematic alignment decreases. Specifically, for every 10-user increase in group size, HTCI declines by 0.05 (SE = 0.01), reinforcing the idea that larger groups engage in more diverse discursive content.

Table 4.

Group-Level HTCI Scores and Polarization Levels.

Group	Corresponding themes (highest similarity)	HTCI	SD	Polarization level	Interpretation
G1	Vulnerable groups	0.65	0.18	Low	Large, fragmented discussion
G2	Public awareness	0.68	0.15	Low	Broad, diverse engagement
G3	Law and politics	0.72	0.12	Moderate	Emerging thematic focus
. . .	. . .	. . .	. . .	. . .	. . .
G100	Vulnerable groups	0.78	0.09	Moderate-High	Cohesive but not extreme
. . .	. . .	. . .	. . .	. . .	. . .
G178	Persecution experiences	0.83	0.05	High	Tight-knit echo chamber
. . .	. . .	. . .	. . .	. . .	. . .
G356	Incident areas	0.86	0.02	Extreme High	Maximal topic alignment
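To clarify how a correlation coefficient and a per-10-user slope relate to each other, the sketch below computes Pearson's r and an ordinary least squares slope on hypothetical data. The five data points are invented and deliberately placed on a straight line whose slope equals the reported magnitude (0.05 per 10 users); they are not the study's data, so r here is −1 and not the reported −.72.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ols_slope(x, y):
    """Least squares slope of y regressed on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical group sizes and topic-concentration scores (not from Table 4).
group_sizes = [10, 20, 30, 40, 50]
concentration = [0.90, 0.85, 0.80, 0.75, 0.70]

r = pearson_r(group_sizes, concentration)
slope_per_user = ols_slope(group_sizes, concentration)
slope_per_10_users = 10 * slope_per_user
print(round(r, 3), round(slope_per_10_users, 3))
```

The slope describes how much the score changes per additional user, whereas r describes how tightly the points follow that line. A correlation weaker than −1, such as the reported −.72, indicates that group size explains part, but not all, of the variation in topic concentration.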

Discussion

Online discussions on X regarding human trafficking are marked by multifaceted polarization, evident in both group polarization—the formation of distinct user clusters—and opinion polarization—the thematic segmentation of discourse. To comprehend these dynamics, social identity theory and self-categorization theory offer critical insights into how individuals affiliate with particular discussion groups, reinforce their collective identity, and selectively interact with information that aligns with their pre-existing beliefs. The case of human trafficking exemplifies how these theories elucidate not only the underlying mechanisms of polarization but also its impact on shaping public discourse and fostering collective action within digital environments.

According to social identity theory, individuals derive a sense of self from their membership in social groups, fostering ingroup cohesion and outgroup differentiation (Dutot, 2020). In online discussions on human trafficking, social media users tend to self-organize into communities that reflect their interests, values, and perspectives on the issue. For instance, groups focused on victim advocacy primarily engage with narratives centered on survivors’ experiences, whereas those emphasizing law enforcement concentrate on legal frameworks and criminal networks. Although these groups may share a common concern regarding human trafficking, their distinct identity-based framing results in minimal interaction across group boundaries, thereby reinforcing discursive segregation. As self-categorization theory suggests, this process enhances within-group consensus, making individuals more resistant to alternative perspectives (Flanagin et al., 2014).

This dynamic is particularly pronounced in opinion leader-driven clusters, where high-profile figures (e.g. activists, policymakers, journalists) shape the discourse within their communities, further solidifying ideological boundaries. Influential users—identified through measures such as centrality and PageRank—play a pivotal role in structuring discourse by amplifying ingroup perspectives while filtering or countering outgroup narratives. This leads to varying degrees of polarization, where some groups remain encapsulated within ideological silos, while others serve as bridges between differing viewpoints. Ultimately, these findings emphasize how social identity processes influence online discussions about human trafficking, reinforcing group-based divisions that shape the spread of information, engagement patterns, and the overall structure of online discourse.

While group polarization fosters social clustering, opinion polarization reflects cognitive segmentation, where users selectively engage with specific aspects of human trafficking while disregarding others. Self-categorization theory explains that individuals categorize not only themselves but also information, leading to thematic silos in online discussions (Hogg & Terry, 2000). For instance, some online communities prioritize discussions about forced labor, whereas others focus on sex trafficking, legal frameworks, or human rights advocacy. This thematic division reflects structural polarization, which, unlike affective or ideological polarization, highlights the fragmentation of discourse along issue-specific lines without necessarily implying emotional hostility or opposing ideologies. For example, a group advocating for strict anti-trafficking laws and a group emphasizing harm reduction approaches for vulnerable populations both seek to address the same issue, yet their discourse remains parallel rather than intersecting, thus limiting opportunities for collaborative solutions.

Furthermore, we find that users tend to engage with content that aligns with their pre-existing identity, thereby fostering selective exposure and confirmation bias. For instance, a group emphasizing survivor narratives may disregard discussions centered on criminal justice perspectives, perceiving them as overly punitive. Conversely, groups focused on law enforcement solutions may dismiss advocacy-driven narratives as overly emotionally charged or politically biased. This mutual disengagement diminishes the likelihood of constructive dialogue between perspectives that could, in fact, be complementary rather than adversarial. Moreover, the presence of high-engagement influencers within each community amplifies identity alignment, as users rely on opinion leaders to establish in-group norms and define acceptable discursive boundaries. Over time, these boundaries become entrenched, reinforcing discursive silos that impede a holistic understanding of human trafficking.

The resulting similarity matrix provides a detailed overview of the content overlap between clustered groups and identified topics, elucidating how users’ engagement with human trafficking discussions is shaped by their social affiliations and cognitive biases. In highly polarized environments, similarity values tend to be sparse, with high values concentrated within specific topics, whereas more fluid discussions exhibit moderate similarities distributed across multiple topics.

Notably, larger groups exhibit a more dispersed focus across various topics, whereas smaller groups demonstrate more concentrated engagement with specific themes. This pattern suggests that as the number of users and interactions within a cluster increases, topic diversity and divergent thinking intensify. In contrast, smaller clusters with fewer interactions tend to concentrate intensely on specific topics, potentially reinforcing ideological homogeneity and stronger polarization. The analysis of the similarity scores reveals that certain groups exhibit a high degree of alignment with particular narratives, reinforcing shared identities and collective beliefs, while others display considerable divergence in focus. This pattern aligns with the proposition that social identity fosters group cohesion, strengthening users’ commitment to specific thematic frames while limiting engagement with broader, integrative perspectives on the issue (Fritsche et al., 2018).

Implications

Theoretical implications

This study makes a significant contribution to the field of online polarization research by integrating perspectives from social identity and self-categorization in examining public perceptions of human trafficking on social media platforms. We propose a conceptual framework that encompasses a multifaceted analysis of polarization.

The modeling of clustering and group polarization provides valuable insights into group dynamics, enabling the identification of subgroups with shared interests and potentially revealing echo chambers within social networks. Network metrics further underscore the role of influential nodes in shaping information diffusion, exerting structural control, and maintaining network cohesion.

Our study advances the literature through the application of BERTopic, an advanced topic modeling method, for identifying opinion polarization patterns related to human trafficking. Although previous studies have utilized BERTopic to detect signatures of problem gambling in online communication data (Smith et al., 2023) and for thematic characterization (Jeon et al., 2023), its potential for examining opinion polarization on social media remains largely unexplored.

Finally, this study contributes to a deeper understanding of multifaceted polarization in online discourse on human trafficking by analyzing both group and opinion polarization. Using TF-IDF vectors and cosine similarity, we assess the alignment between user-generated content within clustered groups and identified topics, revealing patterns of thematic coherence and overlap. The HTCI captures topic concentration within each group, while the WMPI assesses overall polarization on the issue by accounting for both group size and thematic alignment. Collectively, these methods offer a rigorous analytical framework for examining the dynamics of online polarization and its broader implications for public discourse.

Practical implications

This research advances scholarly understanding in the fields of international affairs and crisis management by addressing the complex and pervasive issue of human trafficking, which entails profound societal consequences. Understanding and combating human trafficking is crucial for protecting human rights, strengthening law enforcement and victim support systems, raising public awareness, informing policy development, and addressing the root causes of exploitation.

Our approach to group clustering facilitates the identification of influential individuals, groups, or organizations within the network, thereby amplifying anti-trafficking messages and supporting resource mobilization. Furthermore, the examination of group polarization enables an assessment of the direction and strength of information flow within the network, which is strategically valuable for designing and implementing targeted interventions.

Our BERTopic-based approach to topic identification in the context of human trafficking represents a substantial methodological advancement in automated topic classification. This method enables the detection of emerging trends and salient issues associated with this pressing social problem. By assessing the prevalence of specific topics or terms, researchers can evaluate public awareness and the depth of understanding across multiple dimensions of the issue. Consequently, government agencies and advocacy organizations can allocate resources more effectively by prioritizing areas associated with heightened public engagement.

Finally, the HTCI and WMPI provide actionable metrics to assess the concentration and polarization of discourse, allowing stakeholders to design targeted interventions that foster balanced dialogue and mitigate ideological fragmentation. In addition, these tools can be applied to other contentious social issues, offering a transferable framework for understanding and addressing the dynamics of online polarization in public discourse.

Conclusion

Human trafficking, as a heinous and illicit transnational activity, raises substantial public concern and demands a comprehensive approach to ensure global public security. This study fills the research gap regarding the influence of social media platforms on the multifaceted polarization of global affairs, offering new insights and establishing a basis for continued research in this dynamic field. Rather than being a panacea, social media presents new layers of complexity that warrant systematic examination to foster a more informed and balanced discussion on international issues.

Limitations and future research

This study has several limitations that should be acknowledged. First, the data were drawn solely from the social media platform X, potentially limiting the generalizability of the clustering and topic modeling results. Different platforms, such as Facebook, YouTube, and Instagram, exhibit distinct patterns of information dissemination, particularly in terms of user interactions and the formation of emergent groups. To enhance the validity of our findings, future research should incorporate additional platforms to assess online polarization in a broader context. Second, the BERTopic model used for topic classification has demonstrated efficiency exclusively on our English-language dataset. To evaluate its general applicability, future studies should test the model on non-English texts. Third, while HDBSCAN allows the BERTopic model to approximate topic distribution within a document using the probability matrix, this method only partially resolves the issue and does not adequately account for documents that contain multiple topics. To address this limitation, future work should aim to develop an adapted version of HDBSCAN that can more effectively manage topic modeling in multi-topic scenarios. Finally, although our interpretation draws on social identity theory to explain patterns of group polarization and discourse alignment, it is important to note that identity processes are inferred from observed group structures and communication patterns rather than directly measured. Future research should incorporate direct measures of identity salience and group affiliation to validate and extend these findings.

Footnotes

Appendix 1

Themes	Topics	Top keywords (c-TF-IDF score)	Proportion
Vulnerable groups	0	(“children,” 0.1053), (“child,” 0.05990), (“women,” 0.0470), (“parents,” 0.03644), (“families,” 0.02915)	6.98%
	9	(“women,” 0.0237), (“girls,” 0.0186), (“rights,” 0.0164), (“people,” 0.0122), (“humantrafficking,” 0.0108)	4.49%
	17	(“ohio,” 0.0427), (“teacher,” 0.0285), (“kids,” 0.0271), (“victims,” 0.0244), (“vulnerable,” 0.0226)	3.11%
	18	(“tate,” 0.0237), (“andrew,” 0.0186), (“brother,” 0.0164), (“slaves,” 0.0645), (“survivors,” 0.0501)	2.98%
Persecution experience	4	(“porn,” 0.106), (“pedophilia,” 0.104), (“rape,” 0.0912), (“harvesting,” 0.0868), (“sextrafficking,” 0.03789)	4.95%
	6	(“slavery,” 0.0125), (“endworkness,” 0.0112), (“salve,” 0.0009), (“remorse,” 0.0008), (“satanism,” 0.0008)	4.88%
	13	(“podcast,” 0.1012), (“suffering,” 0.0998), (“tour,” 0.0973), (“torture,” 0.0885), (“trauma,” 0.0699)	3.30%
	27	(“interview,” 0.1146), (“trafficking,” 0.0667), (“abuse,” 0.0498), (“pedophilia,” 0.0469), (“organ,” 0.0354)	0.84%
Incident area	1	(“Israel,” 0.0806), (“Israeli,” 0.0655), (“Hamas,” 0.0453), (“boundaries,” 0.0409), (“America,” 0.0403)	6.37%
	7	(“Ukraine,” 0.1053), (“bio,” 0.0599), (“labs,” 0.0470), (“Kenya,” 0.0365), (“Rwanda,” 0.0292)	4.62%
	8	(“organ,” 0.0868), (“harvesting,” 0.861), (“China,” 0.0523), (“Arizona,” 0.0488), (“Ukr,” 0.0407)	4.50%
	14	(“Rwanda,” 0.0139), (“refugees,” 0.0132), (“African,” 0.0116), (“headmen,” 0.0095), (“NYC,” 0.0090)	3.29%
Law and politics	5	(“charges,” 0.0511), (“crime,” 0.0428), (“law,” 0.0406), (“commissioner,” 0.0404), (“unvetted,” 0.0371)	4.90%
	22	(“power,” 0.1542), (“executive,” 0.1316), (“ballot,” 0.1310), (“police,” 0.0739), (“operating,” 0.0547)	1.96%
	23	(“republications,” 0.1141), (“voted,” 0.0647), (“Gaetz,” 0.0542), (“federal,” 0.0406), (“illegals,” 0.0327)	1.89%
	26	(“challenge,” 0.0971), (“supreme,” 0.0450), (“court,” 0.0363), (“trade,” 0.0345), (“empire,” 0.0299)	0.96%
Public consciousness	3	(“claytonconway65,” 0.0385), (“elonmusk,” 0.0254), (“Raph,” 0.0238), (“accountability,” 0.0231), (“contract,” 0.0211)	5.54%
	10	(“community,” 0.0965), (“awareness,” 0.0867), (“join,” 0.0821), (“WalkForFreedom,” 0.0456), (“preventing,” 0.435)	3.45%
	11	(“GregAbbott,” 0.1806), (“TX,” 0.1149), (“Texas,” 0.1034), (“disgraceful,” 0.0823), (“truth,” 0.0823)	3.40%
	19	(“Fentanyl,” 0.1931), (“Jordan,” 0.1580), (“Denomi,” 0.1538), (“extremism,” 0.1531), (“strength,” 0.1465)	2.97%
Contraband	2	(“border,” 0.2653), (“cartels,” 0.0860), (“drug,” 0.0673), (“videos,” 0.0583), (“indoctrination,” 0.4904)	5.85%
	16	(“EarnInDay,” 0.3244), (“EarnIn,” 0.2884), (“combat,” 0.2714), (“drug,” 0.2489), (“operation,” 0.2308)	3.13%
	21	(“smuggling,” 0.1589), (“migrant,” 0.1456), (“trafficking,” 0.1279), (“money,” 0.1272), (“fraud,” 0.1125)	2.52%
	25	(“arrested,” 0.0949), (“biolabs,” 0.0471), (“death,” 0.0471), (“slaves,” 0.0639), (“import,” 0.0500)	1.24%
Case events	12	(“Jadi,” 0.8789), (“NG,” 0.7802), (“korban,” 0.7654), (“Tristan,” 0.7226), (“MikeGil,” 0.7223)	3.40%
	15	(“POTUS,” 0.1706), (“Trump,” 0.0856), (“Donald,” 0.0611), (“Travis,” 0.0602), (“Diana,” 0.0524)	3.22%
	20	(“biden,” 0.3433), (“cartels,” 0.0652), (“border,” 0.0624), (“boundaries,” 0.0621), (“Hillary,” 0.0592)	2.67%
	24	(“Peru,” 0.3475), (“43,” 0.2285), (“syndicate,” 0.1168), (“YouTube,” 0.1036), (“released,” 0.0926)	1.75%
	28	(“biden,” 0.1992), (“Joe,” 0.1613), (“oversees,” 0.1295), (“Hollywood,” 0.0949), (“accused,” 0.0942)	0.84%

Acknowledgements

There are no acknowledgments to declare.

ORCID iDs

Yunfei Xing

Justin Zuopeng Zhang

Ethical Considerations

No human participants or animals were involved in the experimental procedures conducted for this research.

Consent to Participate

All respondents provided informed consent before enrollment in the study.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Department of Education of Jilin Province (JJKH20241362SK).

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

Data will be made available on request.

Author biographies

Yunfei Xing is an Associate Professor in the School of Business and Management at Jilin University, China. She received her PhD from the School of Management at Jilin University. Her research interests include social media analysis, big data analytics, public opinion, user behavior, and machine learning. Her articles have appeared in Technology in Society, Journal of Information Science, and Journal of Enterprise Information Management, among others.

Justin Zuopeng Zhang is a faculty member in the Coggin College of Business at the University of North Florida. He received his PhD in Business Administration with a concentration in Management Science and Information Systems from Pennsylvania State University, University Park. His research interests include economics of information systems, knowledge management, electronic business, business process management, information security, and social networking. He is the editor-in-chief of the Journal of Global Information Management, an ABET program evaluator, and an IEEE senior member.

References

Agrawal

Roy

Mitra

(2021). Tag embedding based personalized point of interest recommendation system. Information Processing & Management, 58(6), 102690. https://doi.org/10.1016/j.ipm.2021.102690

Ali

Mathew

Mordeson

J. N.

(2021). Hamiltonian fuzzy graphs with application to human trafficking. Information Sciences, 550, 268–284. https://doi.org/10.1016/j.ins.2020.10.029

Horváth

Sheng

Song

Sun

(2023). Skill requirements in job advertisements: A comparison of skill-categorization methods based on wage regressions. Information Processing & Management, 60(2), 103185. https://doi.org/10.1016/j.ipm.2022.103185

Ardabili

B. R.

Danesh Pazho

Alinezhad Noghre

Katariya

Hull

Reid

Tabkhi

(2024). Exploring public’s perception of safety and Video Surveillance Technology: A survey approach. Technology in Society, 78, 102641. https://doi.org/10.1016/j.techsoc.2024.102641

Ashforth

B. E.

Mael

(1989). Social identity theory and the organization. The Academy of Management Review, 14(1), 20. https://doi.org/10.2307/258189

Bail

C. A.

Argyle

L. P.

Brown

T. W.

Bumpus

J. P.

Chen

Hunzaker

M. B.

Lee

Mann

Merhout

Volfovsky

(2018). Exposure to opposing views on social media can increase political polarization. Proceedings of the National Academy of Sciences, 115(37), 9216–9221. https://doi.org/10.1073/pnas.1804840115

Boler

Kweon

Y.-J.

Threasaigh

M. N.

(2024). Digital affect culture and the logics of melodrama: Online polarization and the January 6 capitol riots through the lens of genre and affective discourse analysis. Social Media + Society, 10(1), 20563051241228584. https://doi.org/10.1177/20563051241228584

Buder

Zimmermann

Buttliere

Rabl

Vogel

Huff

(2023). Online interaction turns the congeniality bias into an uncongeniality bias. Psychological Science, 34(10), 1055–1068. https://doi.org/10.1177/09567976231194590

Burgers

Beukeboom

C. J.

Smith

P. A. L.

van Biemen

(2023). How live twitter commentaries by professional sports clubs can reveal intergroup dynamics. Computers in Human Behavior, 139, 107528. https://doi.org/10.1016/j.chb.2022.107528

10.

Cao

(2019). Exploring the influence of excessive social media use at work: A three-dimension usage perspective. International Journal of Information Management, 46, 83–92. https://doi.org/10.1016/j.ijinfomgt.2018.11.019

11.

Celikten

Onan

(2025). Topic modeling through rank-based aggregation and LLMS: An approach for AI and human-generated scientific texts. Knowledge-Based Systems, 314, 113219. https://doi.org/10.1016/j.knosys.2025.113219

12.

Chen

Wang

Yang

Cong

(2021). Modeling multidimensional public opinion polarization process under the context of derived topics. International Journal of Environmental Research and Public Health, 18(2), 472. https://doi.org/10.3390/ijerph18020472

13.

Cho

S.-Y.

(2013). Integrating equality: Globalization, women’s rights, and human trafficking. International Studies Quarterly, 57(4), 683–697. https://doi.org/10.1111/isqu.12056

14.

Cinelli

Etta

Franzoni

Mancini

Niewiadomski

(2024). From polarization to pro-sociality: Measuring beneficence in controversial online conversations. IEEE Access, 12, 102851–102861. https://doi.org/10.1109/access.2024.3430495

15.

Clauset

Newman

M. E.

Moore

(2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111. https://doi.org/10.1103/physreve.70.066111

16.

Colleoni

Rozza

Arvidsson

(2014). Echo chamber or public sphere? Predicting political orientation and measuring political homophily in Twitter using Big Data. Journal of Communication, 64(2), 317–332. https://doi.org/10.1111/jcom.12084

17.

Combs

Tierney

Guay

Merhout

Bail

C. A.

Hillygus

D. S.

Volfovsky

(2023). Reducing political polarization in the United States with a mobile chat platform. Nature Human Behaviour, 7(9), 1454–1461. https://doi.org/10.1038/s41562-023-01655-0

18.

Dutot

(2020). A social identity perspective of social media’s impact on satisfaction with life. Psychology & Marketing, 37(6), 759–772. https://doi.org/10.1002/mar.21333

19.

Flanagin

A. J.

Hocevar

K. P.

Samahito

S. N.

(2014). Connecting with the user-generated web: How group identification impacts online information sharing and evaluation. Information, Communication & Society, 17(6), 683–694. https://doi.org/10.1080/1369118x.2013.808361

20.

Fritsche

Barth

Jugert

Masson

Reese

(2018). A Social Identity Model of Pro-Environmental Action (SIMPEA). Psychological Review, 125(2), 245–269. https://doi.org/10.1037/rev0000090

21.

Guess

A. M.

Malhotra

Pan

Barberá

Allcott

Brown

Crespo-Tenorio

Dimmery

Freelon

Gentzkow

González-Bailón

Kennedy

Kim

Y. M.

Lazer

Moehler

Nyhan

Rivera

C. V.

Settle

Thomas

D. R.

. . Tucker

J. A.

(2023a). How do social media feed algorithms affect attitudes and behavior in an election campaign? Science, 381(6656), 398–404. https://doi.org/10.1126/science.abp9364

22.

Guess

A. M.

Malhotra

Pan

Barberá

Allcott

Brown

Crespo-Tenorio

Dimmery

Freelon

Gentzkow

González-Bailón

Kennedy

Kim

Y. M.

Lazer

Moehler

Nyhan

Rivera

C. V.

Settle

Thomas

D. R.

. . Tucker

J. A.

(2023b). Reshares on social media amplify political news but do not detectably affect beliefs or opinions. Science, 381(6656), 404–408. https://doi.org/10.1126/science.add8424

23. Haq, S. U., & Kwok, R. Y. (2024). Encountering “the other” in religious social media: A cross-cultural analysis. Social Media + Society, 10(4), 20563051241303363. https://doi.org/10.1177/20563051241303363

24. Hogg, M. A., & Terry, D. J. (2000). Social identity and self-categorization processes in organizational contexts. The Academy of Management Review, 25(1), 121–140. https://doi.org/10.2307/259266

25. Hogg, M. A., & Turner, J. C. (1987). Intergroup behaviour, self-stereotyping and the salience of social categories. British Journal of Social Psychology, 26(4), 325–340. https://doi.org/10.1111/j.2044-8309.1987.tb00795.x

26. Jeon, Yoon, & Sohn, S. Y. (2023). Exploring new digital therapeutics technologies for psychiatric disorders using BERTopic and PatentSBERTa. Technological Forecasting and Social Change, 186, 122130. https://doi.org/10.1016/j.techfore.2022.122130

27. Kasban & Nassar (2020). An efficient approach for forgery detection in digital images using Hilbert–Huang Transform. Applied Soft Computing, 97, 106728. https://doi.org/10.1016/j.asoc.2020.106728

28. Keighley & Sanders (2023). Prevention of modern slavery within sex work: Study protocol of a mixed methods project looking at the role of adult services websites. PLOS ONE, 18(5), Article e0285829. https://doi.org/10.1371/journal.pone.0285829

29. Kejriwal & Szekely (2022). Knowledge graphs for social good: An entity-centric search engine for the human trafficking domain. IEEE Transactions on Big Data, 8(3), 592–606. https://doi.org/10.1109/tbdata.2017.2763164

30. Khawar & Boukes (2024). Analyzing sensationalism in news on Twitter (X): Clickbait journalism by legacy vs. online-native outlets and the consequences for user engagement. Digital Journalism. Advance online publication. https://doi.org/10.1080/21670811.2024.2394764

31. Koschate, Naserian, Dickens, Stuart, Russo, & Levine (2021). ASIA: Automated social identity assessment using linguistic style. Behavior Research Methods, 53(4), 1762–1781. https://doi.org/10.3758/s13428-020-01511-3

32. Kranrattanasuit (2024). Utilising the communication for development approach to prevent online child trafficking in Thailand. Humanities and Social Sciences Communications, 11(1), 197. https://doi.org/10.1057/s41599-024-02614-4

33. Lane, D. S., Moxley, C. M., & McLeod (2023). The group roots of social media politics: Social sorting predicts perceptions of and engagement in politics on social media. Communication Research, 50(7), 904–932. https://doi.org/10.1177/00936502231161400

34. Lee, J. K., Choi, & Kim (2014). Social media, network heterogeneity, and opinion polarization. Journal of Communication, 64(4), 702–722. https://doi.org/10.1111/jcom.12077

35. (2022). Identity construction in social media: A study on blogging continuance. Behaviour & Information Technology, 41(8), 1671–1688. https://doi.org/10.1080/0144929x.2021.1895319

36. Mak, Bentley, Paphitis, Huq, Zimmerman, Osrin, Devakumar, Abas, & Kiss (2023). Psychosocial interventions to improve the mental health of survivors of human trafficking: A realist review. The Lancet Psychiatry, 10(7), 557–574. https://doi.org/10.1016/s2215-0366(23)00105-0

37. Malik, D. S., Mathew, & Mordeson, J. N. (2018). Fuzzy incidence graphs: Applications to human trafficking. Information Sciences, 447, 244–255. https://doi.org/10.1016/j.ins.2018.03.022

38. Marchetti, Stanziano, Mincigrucci, & Pagiotti (2024). Bias and polarization in the Qatargate scandal: A social media perspective. Social Media + Society, 10(4), 20563051241306323. https://doi.org/10.1177/20563051241306323

39. Masullo, G. M. (2023). A new solution to political divisiveness: Priming a sense of common humanity through Facebook meme-like posts. New Media & Society, 27(2), 808–827. https://doi.org/10.1177/14614448231184633

40. McNally & Bastos (2025). The news feed is not a black box: A longitudinal study of Facebook’s algorithmic treatment of news. Digital Journalism. Advance online publication. https://doi.org/10.1080/21670811.2025.2450623

41. Menezes, S. M., Kumar, & Dutta (2024). Navigating in turbulent times: Using social media to examine small and family-owned business topics and sentiments during the COVID-19 crisis. Information Systems Frontiers. Advance online publication. https://doi.org/10.1007/s10796-024-10542-6

42. Messing & Westwood, S. J. (2012). Selective exposure in the age of social media. Communication Research, 41(8), 1042–1063. https://doi.org/10.1177/0093650212466406

43. Moran, R. E., & Prochaska (2023). Misinformation or activism? Analyzing networked moral panic through an exploration of #SaveTheChildren. Information, Communication & Society, 26(16), 3197–3217. https://doi.org/10.1080/1369118x.2022.2146986

44. Nyhan, Settle, Thorson, Wojcieszak, Barberá, Chen, A. Y., Allcott, Brown, Crespo-Tenorio, Dimmery, Freelon, Gentzkow, González-Bailón, Guess, A. M., Kennedy, Kim, Y. M., Lazer, Malhotra, Moehler, ... Tucker, J. A. (2023). Like-minded sources on Facebook are prevalent but not polarizing. Nature, 620(7972), 137–144. https://doi.org/10.1038/s41586-023-06297-w

45. Overgaard, C. S. (2024). Perceiving affective polarization in the United States: How social media shape meta-perceptions and affective polarization. Social Media + Society, 10(1), 20563051241232662. https://doi.org/10.1177/20563051241232662

46. Pfeffer, Barrick, Galvan, Marfori, F. M., & Williams, S. A. (2024). “I’d rather be broke than harmed”: A qualitative analysis of the experiences of people engaged in commercial sex work during the COVID-19 pandemic. Public Health Reports®, 140(Suppl. 1), 61S–66S. https://doi.org/10.1177/00333549241236079

47. Phillips, S. C., & Carley, K. M. (2024). An organizational form framework to measure and interpret online polarization. Information, Communication & Society, 27(6), 1163–1195. https://doi.org/10.1080/1369118x.2023.2240580

48. Powell, A. B. (2024). Objectivity vs affect: How competing forms of legitimacy can polarize public debate in data-driven public consultation. Information, Communication & Society. Advance online publication. https://doi.org/10.1080/1369118x.2024.2329623

49. Preble, K. M., Nichols, & Cox (2022). Working with survivors of human trafficking: Results from a needs assessment in a Midwestern state, 2019. Public Health Reports®, 137(Suppl. 1), 111S–118S. https://doi.org/10.1177/00333549221089254

50. Qian, T. Y., & Seifried (2023). Virtual interactions and sports viewing on social live streaming platforms: The role of co-creation experiences, platform involvement, and follow status. Journal of Business Research, 162, 113884. https://doi.org/10.1016/j.jbusres.2023.113884

51. Renstrom, E. A., Back, & Carroll (2023). Threats, emotions, and affective polarization. Political Psychology, 44(6), 1337–1366. https://doi.org/10.1111/pops.12899

52. Risius, Blasiak, K. M., Wibisono, & Louis, W. R. (2024). The digital augmentation of extremism: Reviewing and guiding online extremism research from a sociotechnical perspective. Information Systems Journal, 34(3), 931–963.

53. Robertson, R. E., Green, Ruck, D. J., Ognyanova, Wilson, & Lazer (2023). Users choose to engage with more partisan news than they are exposed to on Google search. Nature, 618(7964), 342–348. https://doi.org/10.1038/s41586-023-06078-5

54. Savoia, Piltch-Loeb, Muibu, Leffler, Hughes, & Montrond (2023). Reframing human trafficking awareness campaigns in the United States: Goals, audience, and content. Frontiers in Public Health, 11, Article 1195005. https://doi.org/10.3389/fpubh.2023.1195005

55. Simmons, B. A., Lloyd, & Stewart, B. M. (2018). The global diffusion of law: Transnational crime and the case of human trafficking. International Organization, 72(2), 249–281. https://doi.org/10.1017/s0020818318000036

56. Smith, Reiter, & Peters (2023). Automatic detection of problem-gambling signs from online texts using large language models. arXiv. https://arxiv.org/abs/2312.00804

57. Song, Merlin, & Rodriguez (2013). Comparing measures of urban land use mix. Computers, Environment and Urban Systems, 42, 1–13. https://doi.org/10.1016/j.compenvurbsys.2013.08.001

58. Stollwerck, E. A. T., Rollmann, Friederich, H. C., & Nikendei (2024). Responding to human trafficking among refugees: Prevalence and test accuracy of a modified version of the adult human trafficking screening tool. BMC Public Health, 24(1), Article 1685. https://doi.org/10.1186/s12889-024-18997-7

59. Such, Campos-Matos, Hayes, McCoig, Thornton, & Woodward (2024). A public health approach to modern slavery in the United Kingdom: A codeveloped framework. Public Health, 232, 146–152. https://doi.org/10.1016/j.puhe.2024.04.004

60. Tajfel & Turner (1979). An integrative theory of intergroup conflict. In Austin, W. G., & Worchel (Eds.), The social psychology of intergroup relations (pp. 33–47). Brooks/Cole Publishing.

61. Tsai, H.-T., & Bagozzi, R. P. (2014). Contribution behavior in virtual communities: Cognitive, emotional, and social influences. MIS Quarterly, 38(1), 143–163. https://doi.org/10.25300/misq/2014/38.1.07

62. Van Buren, H. J., Schrempf-Stirling, & Westermann-Behaylo (2019). Business and human trafficking: A social connection and political responsibility model. Business & Society, 60(2), 341–375. https://doi.org/10.1177/0007650319872509

63. Velasquez & Montgomery (2020). Social media expression as a collective strategy: How perceptions of discrimination and group status shape US Latinos’ online discussions of immigration. Social Media + Society, 6(1), 2056305120914009. https://doi.org/10.1177/2056305120914009

64. Wakefield, R. L., & Wakefield (2023). The antecedents and consequences of intergroup affective polarisation on social media. Information Systems Journal, 33(3), 640–668. https://doi.org/10.1111/isj.12419

65. Wang (2023). Joint liability and aggravation? An inspection of legislative and judicial practices in cases of the crime of the abduction, sale, and purchase of women and children in China. Humanities and Social Sciences Communications, 10(1), 785. https://doi.org/10.1057/s41599-023-02218-4

66. Wuestenenk, van Tubergen, & Stark, T. H. (2025). The influence of group membership on online expressions and polarization on a discussion platform: An experimental study. New Media & Society, 27, 225–245. https://doi.org/10.1177/14614448231172966

67. Xing, Wang, & Qiu (2022). Research on opinion polarization by big data analytics capabilities in online social networks. Technology in Society, 68, 101902. https://doi.org/10.1016/j.techsoc.2022.101902

68. Xing & Zhang, J. Z. (2025). Metaverse maelstrom: Dissecting information dynamics and polarisation. Journal of Information Science. Advance online publication. https://doi.org/10.1177/01655515241307546

69. Xing, Zhang, J. Z., Teng, & Zhou (2024). Voices in the digital storm: Unraveling online polarization with ChatGPT. Technology in Society, 77, 102534. https://doi.org/10.1016/j.techsoc.2024.102534

70. Yang & Jiao (2021). Influential factors on collective anxiety of online topic-based communities. Frontiers in Psychology, 12, Article 740065. https://doi.org/10.3389/fpsyg.2021.740065

71. Zimmerman, Hossain, & Watts (2011). Human trafficking and health: A conceptual model to inform policy, intervention and research. Social Science & Medicine, 73(2), 327–335. https://doi.org/10.1016/j.socscimed.2011.05.028

Shadows and Light: Unveiling Multifaceted Polarization in Social Media Discourse on Human Trafficking
