Abstract
Objective
This study aimed to develop a functional typology of authoritative breast cancer–related key opinion leaders (ABKOLs) on Chinese social media and to examine how platform-specific dynamics shape their content strategies and audience emotional responses.
Methods
Videos from selected ABKOL accounts on Douyin and Xiaohongshu were transcribed using automated speech recognition and processed through natural language cleaning. Latent Dirichlet Allocation was used to extract semantic themes, K-means clustering was applied to identify functional types, and sentiment analysis was conducted to assess emotional patterns in user comments.
Results
A total of 19,960 videos from 30 ABKOLs were collected (17,302 from Douyin, 2658 from Xiaohongshu), of which 19,043 valid transcripts were retained for analysis. Four functional types were identified: preventive advocates (Cluster 1,
Conclusion
This study revealed the functional roles, platform distributions, and emotional impacts of various ABKOL types. The findings underscore the importance of aligning content structures and emotional narratives with platform algorithms and audience expectations to optimize the effectiveness of social media–based cancer communication.
Keywords
Introduction
Breast cancer is the most common cancer among women worldwide. According to GLOBOCAN data, there were 2.3 million new cases in 2022, accounting for 11.6% of all cancers. 1 Although mortality has declined, patients continue to experience significant informational and psychological stress, including anxiety, depression, and uncertainty, which severely affect treatment adherence and quality of life.2,3 Additionally, breast cancer patients in noncore cities and rural areas have long faced disparities in healthcare resource allocation and limited access to care. A survey by Wang et al. involving 1570 cancer patients found that 84.1% of rural patients sought medical attention only after symptom onset, and they were significantly less likely to seek care across multiple hospitals compared to urban patients. 4 Against this background, social media has increasingly served as an important source for breast cancer patients to access expert knowledge, emotional support, and recovery guidance. 5
Survey data indicate that 34%–49% of breast cancer patients obtain cancer-related information via the internet. 6 Health-related video and graphic content on social media platforms has grown substantially, with data from TikTok indicating a more than 600% increase in health and wellness content in 2021 and over 3.8 million healthcare providers actively contributing to the platform's dissemination efforts.7,8 This increase corresponds to the continuous improvement in users’ digital health literacy and their growing demand for digital health resources. 9 Digital education through social media platforms has been shown to be effective in improving patients’ disease understanding and education levels, while also enhancing their emotional awareness and social quality of life. 10 In China, Douyin and Xiaohongshu are the two most prominent social video platforms, attracting many breast cancer patients seeking medical information and emotional assistance.11,12 The two platforms nevertheless differ markedly in user profiles, content styles, and engagement patterns. Douyin has approximately 766 million monthly active users across age groups, with 60%–80% aged 18–35 and around 260 million users aged over 60. 13 In contrast, Xiaohongshu, with around 200–300 million monthly active users, primarily appeals to young women aged 18–35. 14 Douyin relies more on algorithm-driven short video recommendations that emphasize visual and auditory stimulation and instant engagement, while Xiaohongshu, which integrates image- and video-based notes, focuses on lifestyle sharing, emotional empathy, and community building, and thus excels in deeper interaction and user retention.15,16
Authoritative breast cancer–related key opinion leaders (ABKOLs) on social media primarily consist of oncologists, nurses, and medically trained content creators. These individuals are playing increasingly central roles in digital health communication through a combination of clinical expertise, relatable communication style, and sustained content dissemination. 17 Research suggests that professional health content creators, by aligning medical credibility with consistent audience engagement, significantly enhance the trustworthiness and reach of health information, thereby improving public understanding, increasing the willingness to apply health knowledge, and ultimately influencing treatment decisions and health behaviors. 18 On platforms such as Douyin and Xiaohongshu, ABKOLs with verified medical credentials serve as crucial sources of health guidance and emotional support for patients. 19 For instance, a study on Douyin found that over 80% of breast cancer screening-related videos were uploaded by healthcare professionals, who received higher reliability and quality ratings. 20 Similarly, Yu et al.'s analysis of 1824 Xiaohongshu posts on emerging infectious diseases revealed that content using structured language and referencing expert sources garnered significantly greater engagement. 21 However, the overwhelming volume and inconsistent quality of online health content continue to pose substantial challenges for breast cancer patients. 22 Existing studies have largely concentrated on the influence of ABKOLs or content analysis limited to individual platforms, while offering limited insight into the functional roles of content types and their associated emotional impacts—particularly in the context of comparative cross-platform analyses. 23
Therefore, this study focused on ABKOLs by extracting representative posts and comment data from the Douyin and Xiaohongshu platforms. Latent Dirichlet Allocation (LDA) was employed to identify semantic themes, and K-means clustering was applied to develop functional profiles of ABKOLs. Furthermore, sentiment analysis was conducted on both first- and second-level user comments to uncover differences in emotional response patterns elicited by ABKOLs across various platforms and functional types. These findings offer valuable insights into how platform-specific ecosystems and the functional positioning of ABKOLs collaboratively shape health communication dynamics within the breast cancer context.
Methods
Data sources and identification of ABKOLs
This study focused on breast cancer–related content and selected data from two leading Chinese social media platforms, Douyin and Xiaohongshu. Publicly available data were collected using Python-based web crawling scripts, which retrieved user profiles and video metadata containing any of the following keywords: “breast cancer,” “breast tumor,” “breast cancer specialist,” “breast cancer physician,” and “breast cancer nurse.” Candidate ABKOLs were then manually screened based on relevance and content specialization. The inclusion framework and selection principles were informed by previous research, including conceptual analyses, scoping reviews, and empirical profiling studies on oncology-related key opinion leaders, which identified engagement patterns and activity benchmarks among authoritative medical professionals on social media.17,24,25 The final inclusion criteria were: (1) a primary focus on breast cancer topics, encompassing patient education, therapeutic suggestions, postoperative management, and emotional support; (2) a minimum of 800 followers to ensure influence and engagement (this threshold was informed by an exploratory review of follower distribution and engagement patterns, in which lower cutoffs mainly captured low-interaction users, whereas higher cutoffs would substantially reduce representativeness, and it is consistent with prior health communication research on social media influencers26,27); (3) evidence of recent and continuous content activity within the preceding three months (this activity window was selected to ensure temporal relevance and reflect ongoing engagement behaviors, as social media communication dynamics can evolve rapidly); and (4) clear professional identification as healthcare providers (e.g., oncologists, surgical specialists, or registered nurses) within their published materials. For ethical and privacy considerations, all ABKOL account identifiers were anonymized using numeric codes.
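The four inclusion criteria can be read as a simple filter over candidate accounts. The sketch below is illustrative only: the `Account` fields, the example records, the reference date, and the 90-day approximation of "three months" are assumptions, and criteria (1) and (4) are represented as flags that would come from the manual screening described above.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Account:
    account_id: str
    followers: int
    last_post: date
    breast_cancer_focus: bool    # criterion 1, decided by manual screening
    verified_professional: bool  # criterion 4, decided by manual screening

MIN_FOLLOWERS = 800                   # threshold stated in the text
ACTIVITY_WINDOW = timedelta(days=90)  # assumption: "three months" taken as 90 days

def is_eligible(acc: Account, reference_date: date) -> bool:
    """Return True only if the account meets all four inclusion criteria."""
    gap = reference_date - acc.last_post
    recently_active = timedelta(0) <= gap <= ACTIVITY_WINDOW
    return (acc.breast_cancer_focus
            and acc.followers >= MIN_FOLLOWERS
            and recently_active
            and acc.verified_professional)

# Hypothetical candidates, not drawn from the study data.
candidates = [
    Account("acc_A", 1200, date(2025, 3, 15), True, True),
    Account("acc_B", 500, date(2025, 4, 1), True, True),    # too few followers
    Account("acc_C", 5000, date(2024, 11, 1), True, True),  # inactive
    Account("acc_D", 900, date(2025, 4, 1), True, False),   # no professional ID
]
eligible = [a.account_id for a in candidates
            if is_eligible(a, reference_date=date(2025, 4, 30))]
```

Keeping the thresholds as named constants makes it easy to rerun the screening with alternative cutoffs, in the spirit of the exploratory review of follower distributions mentioned above.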
Data extraction and preprocessing
Data scraping was performed using Python along with relevant packages. GET requests were sent to target pages via the requests module, while the HTML content was parsed using the etree parser with XPath expressions to obtain metadata such as account IDs, textual captions, upload dates, and engagement statistics. For web environments featuring dynamic rendering or nested layouts, parallel threading and asynchronous task scheduling were employed to increase crawling efficiency. Measures to mitigate antiscraping restrictions included customizing HTTP headers, regulating request intervals, and adding randomized delays to reduce system blocking and sampling bias. Because the majority of source material consisted of short-form videos, all speech content was transcribed into text using the iFLYTEK online automatic speech recognition (ASR) platform, enabling downstream semantic analysis. The resulting textual content, together with its associated metadata, was compiled into a structured corpus for subsequent computational processing.
To prepare the text for modeling, a multistep preprocessing pipeline was applied. This included symbol normalization, removal of extraneous characters and emojis, and the elimination of HTML artifacts and redundant punctuation. A stop-word list derived from the Harbin Institute of Technology, complemented by manually curated terms, was used to filter out noninformative high-frequency words. Lexical unification was conducted to reduce redundancy caused by synonymous expressions. 28 Word segmentation was implemented using the Jieba tokenizer, with only meaningful nouns, verbs, and adjectives retained to support topic modeling.
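A minimal sketch of the cleaning steps is given below, using only Python's standard `re` module. The regular expressions, the emoji ranges, and the three-word stop list are assumptions chosen for illustration, and they are not the study's actual rules. Word segmentation and part-of-speech filtering, which the study performed with Jieba, are left out, so `drop_stop_words` expects a token list produced elsewhere.

```python
import re

HTML_TAG = re.compile(r"<[^>]+>")
HTML_ENTITY = re.compile(r"&[a-zA-Z]+;|&#\d+;")
# Assumption: these ranges cover common emoji and pictographs; they are not exhaustive.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")
REPEATED_PUNCT = re.compile(r"([!?,.;:！？，。；：、])\1+")
WHITESPACE = re.compile(r"\s+")

# Hypothetical stop words; the study used a Harbin Institute of Technology list
# supplemented with manually curated terms.
STOP_WORDS = {"的", "了", "是"}

def clean_text(raw: str) -> str:
    """Strip HTML artifacts and emojis, and collapse repeated punctuation."""
    text = HTML_TAG.sub(" ", raw)
    text = HTML_ENTITY.sub(" ", text)
    text = EMOJI.sub("", text)
    text = REPEATED_PUNCT.sub(r"\1", text)
    return WHITESPACE.sub(" ", text).strip()

def drop_stop_words(tokens):
    """Remove stop words and empty tokens from an already segmented text."""
    return [t for t in tokens if t.strip() and t not in STOP_WORDS]

cleaned = clean_text("<p>乳腺癌筛查很重要！！！😊&nbsp;</p>")
kept = drop_stop_words(["乳腺癌", "的", "筛查"])
```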
Topic modeling and semantic interpretation
Topic modeling was conducted using the LDA algorithm within the Python-based Gensim framework. To enhance semantic transparency and parameter adjustment, the pyLDAvis module was utilized for visualization. 29 The number of topics was optimized by evaluating perplexity scores, with lower values indicating a more reliable model fit. The model output included a topic probability distribution vector for each document, representing its semantic relevance to each topic. 30 Topic labels were independently generated by two researchers with backgrounds in medicine and health communication to ensure accurate interpretation of the latent semantic structure.
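For reference, the perplexity criterion is conventionally defined as below. This is the standard textbook form. The study does not state the exact formula it used, and toolkits typically estimate the log-likelihood term through a variational bound. Here $M$ is the number of evaluation documents, $\mathbf{w}_d$ the words of document $d$, and $N_d$ its token count.

```latex
\mathrm{perplexity}(D_{\mathrm{eval}})
  = \exp\!\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
```

A lower value means the model assigns higher average per-token likelihood to the evaluation documents, which is why lower perplexity is read as better fit when comparing candidate topic numbers.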
Topic vector construction and K-means clustering
Following the identification of seven optimal topics, each ABKOL was represented by a seven-dimensional topic vector based on the proportion of their content assigned to each topic. The vectors were z-score normalized to eliminate scale differences prior to clustering. K-means clustering was performed using Euclidean distance, with a maximum of 300 iterations and 20 random initializations to ensure robust convergence. 31 The optimal number of clusters was determined using the elbow method, scree plot, and silhouette coefficient.
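The normalization and clustering steps can be sketched in plain Python as follows. This is an illustrative reimplementation and not the study's code: it uses random initial centers drawn from the data, the population standard deviation for the z-score, three-dimensional toy vectors instead of seven, and hypothetical topic proportions. Only the Euclidean distance, the 300-iteration cap, and the 20 restarts come from the text.

```python
import math
import random

def zscore_columns(rows):
    """Standardize each column to mean 0 and unit variance."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # Assumption: population standard deviation (divide by n).
    # A constant column has std 0, so 1.0 is substituted to avoid division by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in col) / n) or 1.0
            for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_init=20, max_iter=300, seed=0):
    """Lloyd's algorithm with restarts; returns the lowest-inertia labeling."""
    rng = random.Random(seed)
    best_labels, best_inertia = None, math.inf
    for _ in range(n_init):
        centers = [list(p) for p in rng.sample(points, k)]
        labels = None
        for _ in range(max_iter):
            new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                          for p in points]
            if new_labels == labels:  # assignments stable: converged
                break
            labels = new_labels
            for j in range(k):
                members = [p for p, l in zip(points, labels) if l == j]
                if members:  # an empty cluster keeps its previous center
                    centers[j] = [sum(c) / len(members) for c in zip(*members)]
        inertia = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels, best_inertia

# Hypothetical topic-proportion vectors for six accounts (three topics each).
topic_vectors = [
    [0.70, 0.20, 0.10], [0.65, 0.25, 0.10], [0.75, 0.15, 0.10],
    [0.10, 0.20, 0.70], [0.10, 0.25, 0.65], [0.10, 0.15, 0.75],
]
z = zscore_columns(topic_vectors)
labels, inertia = kmeans(z, k=2)
```

Running several restarts and keeping the solution with the lowest within-cluster sum of squares reduces the chance that one unlucky initialization determines the reported clusters.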
Platform-level comparison
Cross-tabulation of clustering results with ABKOL platform affiliation was performed to investigate platform-specific differences in the distribution of functional content. Specifically, this included: (1) calculating the frequency and proportion of each ABKOL cluster on Douyin and Xiaohongshu to identify dominant platform-specific patterns; and (2) comparing topic compositions across platforms to assess the alignment between platform ecology and ABKOL functional orientations.
Functional profiling and characterization of ABKOLs
To systematically explore the functional differentiation and communication orientation of various ABKOL types, this study integrated clustering outcomes with topic distributions from the LDA model. Representative ABKOLs were randomly selected from each cluster for content analysis, including post characteristics, keyword frequency, semantic patterns, and common linguistic styles. These attributes were synthesized to infer core communication purposes, interaction strategies, and audience targeting, which facilitated the classification of ABKOLs by function. To depict thematic focus across the seven topics, radar charts were constructed based on average topic weights per cluster, highlighting the overall semantic distribution and informing a generalizable framework for functional profiling.
Sentiment analysis of user comments
Sentiment evaluation was conducted on user comments associated with ABKOL content across Douyin and Xiaohongshu, utilizing the SnowNLP toolkit. Both primary and secondary comment layers were included, with standardized natural language preprocessing applied—comprising Chinese word segmentation via Jieba and the elimination of nonsemantic elements such as emojis, special symbols, and stop words using regular expression filtering. Each comment was assigned a sentiment score ranging from 0 to 1. Based on predefined thresholds, scores ≥0.6 were categorized as positive, ≤0.4 as negative, and values in between as neutral. 32 In this study, neutral comments were operationally defined as expressions that did not convey a clear emotional valence. To enhance the credibility of automated analysis, a sample set of comments underwent manual annotation, which demonstrated high concordance with machine-generated labels. Finally, sentiment proportions were aggregated and visualized for different ABKOL types to illustrate cross-category emotional response patterns.
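The thresholding and aggregation rule can be expressed compactly as below. The scores in the example are hypothetical. In practice each score would come from the sentiment model (for SnowNLP, the `sentiments` attribute of a `SnowNLP(text)` object), which is not called here so that the sketch stays dependency-free.

```python
from collections import Counter

def label_sentiment(score, pos=0.6, neg=0.4):
    """Map a score in [0, 1] to a label using the thresholds given in the text."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("sentiment score must lie in [0, 1]")
    if score >= pos:
        return "positive"
    if score <= neg:
        return "negative"
    return "neutral"

def sentiment_shares(scores):
    """Return the proportion of positive, neutral, and negative comments."""
    labels = ("positive", "neutral", "negative")
    if not scores:
        return {lab: 0.0 for lab in labels}
    counts = Counter(label_sentiment(s) for s in scores)
    return {lab: counts[lab] / len(scores) for lab in labels}

# Hypothetical scores for five comments on one cluster's videos.
example_scores = [0.92, 0.60, 0.55, 0.40, 0.10]
shares = sentiment_shares(example_scores)
```

Because both boundaries are inclusive, a score of exactly 0.6 counts as positive and exactly 0.4 as negative; only values strictly between the two are treated as neutral.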
Results
Textual data characteristics and topic modeling results
This analysis covered 30 ABKOL accounts selected from the Douyin and Xiaohongshu platforms, whose historical video materials were extensively retrieved for further analysis. Between February 2018 and April 2025, a total of 19,960 video entries were gathered, including 17,302 from Douyin and 2658 from Xiaohongshu. All audiovisual materials were converted into textual form using ASR tools and then processed through a standardized preprocessing pipeline for cleaning and formatting. A final set of 19,043 high-quality text transcripts was retained for downstream topic modeling, containing a total of 5,848,238 words. In addition, 241,569 first-level and 211,341 second-level comments were collected and analyzed for sentiment to examine users’ emotional responses and interaction patterns toward different ABKOL types.
Seven coherent topics were extracted using the LDA model: postoperative surveillance and lesion assessment (Topic 1), surgical planning and therapeutic decision-making (Topic 2), psychological adjustment and anxiety mitigation (Topic 3), oncological risk awareness and preventive behavior (Topic 4), lactation management and breast health maintenance (Topic 5), targeted therapy and endocrine treatment protocols (Topic 6), and clinical communication and patient–provider engagement (Topic 7).
Clustering results and distribution
To identify distinct functional types of ABKOLs, K-means clustering was applied based on each account's content distribution across the seven LDA topics. Prior to determining the number of clusters, two commonly used diagnostic tools, the silhouette coefficient and the elbow method, were employed to evaluate clustering validity. As shown in Figure 1, the silhouette coefficient peaked at four clusters, indicating an optimal balance between intracluster cohesion and intercluster separation. Figure 2 illustrates the elbow method, where a noticeable inflection at k = 4 further supports the choice of four clusters as the most appropriate solution for partitioning ABKOLs.
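As a reminder of what the silhouette coefficient measures, the sketch below computes the mean silhouette for a labeled point set in plain Python. The four toy points and both labelings are hypothetical. Assigning a silhouette of 0 to single-member clusters is a common convention assumed here, and it matters in this setting because one of the reported clusters contains a single ABKOL.

```python
import math

def euclid(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette_score(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # convention: a singleton contributes s(i) = 0
        # a: mean distance to the other members of the same cluster
        a = sum(euclid(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(euclid(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)

# Hypothetical points forming two well-separated pairs.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(pts, [0, 0, 1, 1])  # labels follow the two pairs
bad = silhouette_score(pts, [0, 1, 0, 1])   # labels cut across the pairs
```

Values near 1 indicate compact, well-separated clusters, while negative values indicate points that sit closer to another cluster than to their own. Choosing the k at which the mean silhouette peaks follows from this interpretation.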

Figure 1. Silhouette coefficient method for optimal k.

Figure 2. Elbow method for optimal k.
Based on these results, 30 ABKOLs were categorized into four distinct clusters, each characterized by unique thematic focus and functional orientation. The detailed content distribution across the seven topics for each ABKOL is presented in Table 1, which summarizes the average topic probability distributions and final clustering assignments. Cluster 1 (
Table 1. Content probability distribution of ABKOLs across topics.
ABKOL: authoritative breast cancer–related key opinion leaders.
Platform-specific distribution of ABKOLs
The distribution of ABKOLs across Douyin and Xiaohongshu is shown in Figure 3. Notably, the four ABKOL clusters demonstrate significant distributional differences between the two platforms. Cluster 1 (preventive advocates) is predominantly found on Xiaohongshu, with nine ABKOLs from this platform and only three from Douyin. This may be attributed to Xiaohongshu's emphasis on text-image formats, which are well-suited for detailed dissemination of professional knowledge. In contrast, Cluster 2 (therapeutic communicators) is overwhelmingly represented on Douyin, comprising 14 ABKOLs from Douyin and only 1 from Xiaohongshu. These ABKOLs tend to favor short video formats for delivering concise and accessible health information, aligning with Douyin's fast-paced and lightweight content ecosystem. Cluster 3 (health promoters) and Cluster 4 (supportive companions) are exclusively composed of Douyin-based ABKOLs, with 1 and 2 members, respectively. No representatives from these clusters were identified on Xiaohongshu. This suggests that emotionally driven, interactive, or companionship-oriented content is more compatible with Douyin's audiovisual environment, which facilitates expressive communication and fosters a sense of everyday intimacy with followers. Overall, Xiaohongshu tends to attract ABKOLs with a strong professional orientation and a preference for structured health discourse, whereas Douyin accommodates a more diverse array of roles, particularly those centered on practical guidance and emotional support. These platform-specific dynamics not only shape the presentation strategies adopted by ABKOLs but may also influence user engagement patterns and emotional feedback mechanisms.
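The cross-tabulation behind these counts can be reproduced with a few lines of Python. The cell counts are those reported in the paragraph above; the variable and function names are illustrative assumptions.

```python
from collections import defaultdict

# (cluster, platform) -> number of ABKOLs, as reported in the text.
reported = {
    ("Cluster 1", "Xiaohongshu"): 9,
    ("Cluster 1", "Douyin"): 3,
    ("Cluster 2", "Douyin"): 14,
    ("Cluster 2", "Xiaohongshu"): 1,
    ("Cluster 3", "Douyin"): 1,
    ("Cluster 4", "Douyin"): 2,
}

platform_totals = defaultdict(int)
cluster_totals = defaultdict(int)
for (cluster, platform), n in reported.items():
    platform_totals[platform] += n
    cluster_totals[cluster] += n

def share_within_cluster(cluster, platform):
    """Proportion of a cluster's ABKOLs that are based on the given platform."""
    return reported.get((cluster, platform), 0) / cluster_totals[cluster]

total_abkols = sum(reported.values())
xhs_share_cluster1 = share_within_cluster("Cluster 1", "Xiaohongshu")
```

The cells sum to the 30 ABKOLs in the sample, with 20 based on Douyin and 10 on Xiaohongshu, so three quarters of the preventive advocates are Xiaohongshu accounts. With several cells at or near zero, these proportions are descriptive rather than inferential.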

Figure 3. Distribution of authoritative breast cancer–related key opinion leaders (ABKOLs).
Representative content analysis of each ABKOL cluster
Cluster 1: Preventive advocates
This group of ABKOLs, as illustrated in Figure 4, primarily focused on the theme of oncological risk awareness and preventive behavior, while also covering topics such as psychological adjustment and anxiety mitigation and targeted therapy and endocrine treatment protocols. The overall content orientation was strongly health-educational, characterized by a preventive mindset and long-term disease management perspective. These ABKOLs frequently adopted conversational and scenario-based video formats to disseminate knowledge related to early breast cancer screening, recurrence surveillance, and medication adherence.

Figure 4. Theme distribution of cluster 1.
On Xiaohongshu, accounts such as ABKOL_01 and ABKOL_02 often employed formats such as “outpatient dialogues” or “casual doctor-patient talk” to foster relatability and clarity. For example, ABKOL_01 repeatedly emphasized statements such as “
On Douyin, ABKOL_25 focused more on breast disease education and medication management. Their content was characterized by logical clarity and a professional tone, often addressing common patient questions, such as “
Collectively, preventive advocates demonstrated a dual emphasis on standardized medical information and individualized emotional support. By combining accessible communication strategies with empathetic narratives, they fostered trust and promoted a proactive view of breast cancer as a controllable and preventable condition. Within the digital health communication ecosystem, they functioned as both science communicators and preventive health advocates, shaping public awareness and encouraging responsible health behaviors.
Cluster 2: Therapeutic communicators
This cluster of ABKOLs primarily focused on two core themes: surgical planning and therapeutic decision-making as well as targeted therapy and endocrine treatment protocols. The distribution of these thematic emphases is depicted in Figure 5. Their content demonstrated a strong emphasis on the clinical decision-making process, including surgical interventions, postoperative care, and pharmacological management. By elucidating medical procedures and therapeutic mechanisms, these ABKOLs aimed to improve patient comprehension and support informed choices among diverse treatment modalities.

Figure 5. Theme distribution of cluster 2.
On Xiaohongshu, ABKOL_04 integrated scientific literacy with a humanistic narrative style. The videos adopted a patient-centered perspective, incorporating empathetic expressions such as “
On Douyin, ABKOL_18 and ABKOL_28 represented a more technical and specialized communication style. Their content frequently addressed questions such as “
In summary, this group of ABKOLs provided comprehensive coverage across the breast cancer treatment trajectory, spanning preoperative evaluation, surgical planning, and postoperative management. Their communication style merged empathy with professionalism, offering structured and accessible explanations that met both informational and emotional needs. By functioning as clinical knowledge translators and therapeutic consultants, these ABKOLs played a critical role in fostering informed decision-making and enhancing public literacy in breast cancer treatment pathways.
Cluster 3: Health promoters
As shown in Figure 6, this cluster of ABKOLs centered on lactation management and breast health maintenance, as well as oncological risk awareness and preventive behavior. Their content focused on health behavior advocacy and everyday breast self-management practices. Compared with other clusters, this group featured a relatively low proportion of treatment-related topics. Instead, their videos prioritized audience-centered communication, using clear and intuitive language to disseminate basic health information and encourage proactive engagement in breast health.

Figure 6. Theme distribution of cluster 3.
A notable exemplar of this category was ABKOL_23 on the Douyin platform. Their videos adopted an informal and approachable style, often capturing viewer interest through interactive questions. In response to inquiries such as “
In terms of communication style, ABKOL_23's content reflected an interview-based and interactive approach, characterized by conversational and content-rich language. Frequent references to terms such as “
Overall, health promoters play a pivotal role in raising awareness and shaping health behaviors among the general public. Rather than delivering complex clinical information, they share accessible, actionable messages about daily lifestyle practices. Through approachable communication strategies, they promote an inclusive vision for breast health management that is participatory and empowering for diverse audiences.
Cluster 4: Supportive companions
This group of ABKOLs was distinguished by a pronounced focus on postoperative surveillance and lesion assessment and psychological adjustment and anxiety mitigation (Figure 7). Their content largely focused on postoperative recovery, the explanation of benign breast conditions, and emotional reassurance, emphasizing companionship and sustained health management in doctor–patient interactions. With a gentle and empathetic communication style, these ABKOLs prioritized trust-building and emotional resonance with patients, playing dual roles as “supporters” and “companions” in their content delivery.

Figure 7. Theme distribution of cluster 4.
ABKOL_27, active on the Douyin platform, exemplified this communicative archetype. Content was typically framed as question-and-answer exchanges or concise educational briefs. They frequently addressed questions such as “
ABKOL_30 emphasized the necessity of active postoperative surveillance and timely follow-up care. Their content covered key topics such as “
Collectively, ABKOLs in this category emphasized sustained monitoring, recovery reassurance, and empathetic dialog. By fostering trust and emotional stability, these ABKOLs provided companion-style support that extended beyond informational transmission, becoming vital touchpoints for recovery and psychological well-being in the digital health landscape.
Sentiment analysis across ABKOLs
A sentiment distribution analysis was conducted in this study across primary and secondary user comments on videos posted by the four ABKOL clusters. As illustrated in the 100% stacked bar charts (Figure 8), the analysis revealed marked differences in audience sentiment across comment levels, contingent upon the functional orientation of the ABKOLs.

Figure 8. Sentiment distribution across primary and secondary comments by authoritative breast cancer–related key opinion leader (ABKOL) cluster.
Positive sentiment predominated in primary comments across all ABKOL clusters, with notably high proportions observed in Cluster 3 (health promoters, 60.72%) and Cluster 1 (preventive advocates, 59.08%). This pattern suggests that emotionally engaging and educational content from these ABKOLs effectively fostered affirmative user responses during initial exposure. Nevertheless, a non-negligible proportion of negative sentiment was also observed in primary comments across all categories. For instance, negative sentiment reached 35.05% and 33.93% in Cluster 2 (therapeutic communicators) and Cluster 4 (supportive companions), respectively. Such negative responses may be attributed to unclear messaging, insufficient perceived professionalism, or disagreement among patient perspectives.
In secondary comments, sentiment composition shifted considerably, with a marked increase in neutral expressions: neutral sentiment rose to 36.85% in Cluster 2 (therapeutic communicators) and 38.76% in Cluster 4 (supportive companions). These neutral responses consisted mainly of information-seeking questions, such as requests for professional interpretation of examination results, and of objective, descriptive statements, such as notes on changes in BI-RADS classification. Both indicate a more cognitively oriented mode of engagement rather than explicit emotional expression. This trend may reflect emotional rationalization during prolonged interaction or a cooling of emotional engagement caused by delayed responses from ABKOLs. In addition, positive sentiment declined across all categories in secondary comments, although Cluster 3 (health promoters) and Cluster 1 (preventive advocates) maintained relatively high levels at 49.78% and 47.33%, respectively. This decline suggests a reduced capacity for sustaining positive emotional engagement during extended discourse.
Overall, Cluster 1 (preventive advocates) and Cluster 3 (health promoters) exhibited greater emotional consistency across both levels of comment interaction, whereas Cluster 2 (therapeutic communicators) and Cluster 4 (supportive companions) tended to elicit polarized or negative emotional responses. The observed increase in neutral sentiment within secondary comments may further reflect a transition toward calmer, more rational user engagement in prolonged interaction chains.
Discussion
This study applied LDA topic modeling and K-means clustering to transcribed video content from Douyin and Xiaohongshu to categorize ABKOLs into four functional types: preventive advocates, therapeutic communicators, health promoters, and supportive companions. Each cluster was characterized by distinct semantic profiles, reflecting differences in language styles, thematic emphasis, and communication strategies. These findings suggest a dual-track communication approach that integrates professional knowledge dissemination with emotional resonance. In addition, the distribution of ABKOL types was influenced by platform-specific characteristics, while users’ emotional feedback patterns in the comment sections varied across content types.
From a semantic perspective, the four ABKOL categories exhibited unique communication styles and content strategies aligned with their respective functional orientations. Preventive advocates focused on long-term surveillance and recurrence risk, combining factual explanations with empathetic language to reduce patient anxiety. This is consistent with Peinado et al.'s cross-sectional study in the United States, which demonstrated that integrating emotional cues into factual health messages improves emotional engagement and dissemination impact. 33 Therapeutic communicators emphasized treatment planning, including surgical and pharmacological strategies, through structured and detailed explanations. This corresponds with findings from a U.S. cross-sectional survey showing that social media enhances cancer patients’ decision-making and fosters trust in treatment pathways. 34 Health promoters, by contrast, prioritized breast health behaviors, using accessible language and interactive formats to encourage self-examination and lifestyle modifications. A randomized controlled trial in Iran similarly found that short-form video interventions improved women's knowledge and attitudes toward breast self-examination, underscoring the value of approachable multimedia content. 35 Supportive companions centered on postoperative care and emotional reassurance, using gentle, empathic expressions to provide ongoing psychological support. This aligns with McKenzie et al.'s qualitative research, which found that young cancer patients relied on social media for emotional comfort and social belonging during recovery. 36 Furthermore, research indicates that breast cancer patients often rely on social media for emotional relief, experience exchange, and a sense of community during the postoperative period. 
This “digital companionship” serves as an important supplement to psychological interventions, particularly among supportive companions on Douyin, who frequently share personal recovery stories, respond to user emotions, and foster empathetic engagement. 37 Additionally, Rafiei et al. 38 found that among breast cancer patients, emotional support was valued more than informational support, especially when delivered via personalized video content. These findings emphasize the vital role of emotionally oriented ABKOLs in supporting psychological adaptation and long-term survivorship. Taken together, the ABKOL categories adopt distinct themes and communication styles and form a complementary digital framework that integrates health education and emotional support to enhance understanding, reassurance, and positive health behaviors in breast cancer care.
The study revealed that different functional types of ABKOLs exhibited distinct distribution patterns across platforms. Preventive advocates were primarily active on Xiaohongshu, whereas health promoters, therapeutic communicators, and supportive companions predominated on Douyin. These findings highlight not only variations in ABKOLs’ content expression but also the influential roles of recommendation algorithms, user structures, and communication preferences in shaping their expressive environments. This suggests that platforms are not neutral intermediaries but actively participate in structuring and deploying the functional roles of ABKOLs. Xiaohongshu, characterized by its community-driven, content-focused model, mainly attracts highly educated young women who favor structured, practical health information integrating expertise with daily scenarios. 39 Yu et al. 21 found that health posts with structured language and authoritative sources garnered higher engagement, aligning well with Xiaohongshu's cultural orientation and the credible and informative style of preventive advocates. By comparison, Douyin operates on an immersive, algorithm-driven logic that favors emotionally intense, fast-paced, and interactive content. A study by Li et al. 40 analyzing health videos on Douyin revealed that videos with intense emotional expression (such as anxiety, fear, or empathy) achieved greater views and shares, whereas neutral, rational content had lower dissemination performance. Similarly, research on educational videos in atrial fibrillation communities showed that high-arousal emotions, particularly fear and hope, were key to promoting user participation. Moreover, individual storytelling, such as patient narratives, can greatly enhance user empathy and their willingness to share. 41 Consequently, the emotionally charged and storytelling-oriented styles of health promoters, therapeutic communicators, and supportive companions are inherently more compatible with Douyin's algorithmic preferences.
Sentiment analysis revealed that preventive advocates and health promoters were more successful in eliciting positive emotional responses in both primary and secondary comments, likely due to their clear messaging and emotional resonance. Conversely, therapeutic communicators and supportive companions were more likely to trigger negative or emotionally polarized reactions, particularly in primary comments, possibly due to users’ anxiety, comprehension difficulties, or conflicting opinions. Guntuku et al. found that emotionally expressive health content on Twitter mirrored community psychological needs, consistent with the emotional volatility observed around therapeutic communicators in our study. 42 A large-scale Instagram analysis further showed that posts blending emotional and informational support significantly increased user engagement, with emotional impact varying by account identity. 43 Notably, neutral sentiment rose markedly in secondary comments, indicating that user emotions tended to rationalize or cool down in deeper interactions. Similar findings emerged in studies on social media comment hierarchies, such as TED Talk comment threads on YouTube, where emotional neutrality increased with comment depth. 44 Yazdani et al. 45 analyzed narrative feedback from hospitalized cancer patients in Iran and noted a mix of neutral and negative sentiments in their evaluations, with emotional intensity diminishing and content becoming more informational at deeper discussion levels. This cooling trend may also stem from gaps in ABKOL responsiveness, where delayed or absent feedback discourages emotional arousal and leads users toward more neutral expressions. Garcia et al.'s experiment on public online discussion platforms demonstrated that while timely responses increased emotional arousal among low-arousal users, the absence of such feedback caused high-arousal users to shift toward emotional neutrality and reduce their future engagement, further supporting our findings. 46
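The comparison between primary and secondary comments described above comes down to computing the share of each sentiment label within each comment level. The following is a minimal sketch of that arithmetic, not the study's actual pipeline: the function name `sentiment_shares`, the level and label strings, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def sentiment_shares(comments):
    """Return {level: {label: share}} from (level, label) pairs.

    `comments` is an iterable of (comment_level, sentiment_label)
    tuples; both fields are assumed to be plain strings.
    """
    counts = defaultdict(Counter)
    for level, label in comments:
        counts[level][label] += 1

    shares = {}
    for level, counter in counts.items():
        total = sum(counter.values())
        shares[level] = {label: n / total for label, n in counter.items()}
    return shares


# Hypothetical toy data, NOT taken from the study.
toy_comments = [
    ("primary", "positive"), ("primary", "negative"),
    ("primary", "negative"), ("primary", "neutral"),
    ("secondary", "neutral"), ("secondary", "neutral"),
    ("secondary", "positive"), ("secondary", "neutral"),
]

shares = sentiment_shares(toy_comments)
print(shares["primary"]["neutral"], shares["secondary"]["neutral"])
```

In this toy example the neutral share is higher among secondary comments than among primary ones, which is the kind of pattern the paragraph above refers to as a cooling trend.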
In summary, this study identified four functionally distinct ABKOL categories using semantic topic modeling and clustering analysis. By integrating platform-specific dynamics and user sentiment from comments, it systematically characterized the dissemination patterns of each type. Notable differences emerged in content themes, communication styles, and interaction strategies, reflecting a dual-pathway model of professional information delivery and emotional engagement. Platform algorithms shaped these dynamics: Xiaohongshu favored structured, knowledge-based content, while Douyin prioritized emotionally expressive and interactive formats. Sentiment analysis further revealed distinct user response patterns across ABKOL types, highlighting variations in emotional resonance during information reception. Building on these findings, this study also provides practical implications for healthcare professionals involved in breast care communication. Preventive advocates may emphasize early screening and self-examination education, while therapeutic communicators can deliver concise, empathetic treatment information. Health promoters could focus on promoting healthy lifestyles, and supportive companions may strengthen emotional connection through interactive engagement. Tailoring content themes, platform choices, and communication tone to each functional type can help enhance both informational value and emotional resonance in digital health communication. Overall, this study offers a multidimensional understanding of ABKOL functions in digital health communication, enriches the typological framework for health communicators on social media, and provides actionable insights for optimizing breast cancer–related content strategies.
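To make the clustering step concrete, the sketch below groups accounts by their topic-proportion vectors with a plain K-means loop. It is an illustration under stated assumptions rather than the study's implementation: the account vectors are invented, three topics and three clusters are used only to keep the example small, and the centroids are initialized from the first `k` points so the run is deterministic. In practice a library implementation with a better initialization would normally be preferred.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain K-means on a list of equal-length numeric vectors.

    Assumption for this sketch: centroids start at the first k points,
    which keeps the example deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical topic proportions per account (NOT study data); the
# columns stand for prevention, treatment, and emotional support.
topic_vectors = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
]

labels, centroids = kmeans(topic_vectors, k=3)
print(labels)
```

Each resulting cluster can then be read as a functional type by inspecting which topics dominate its centroid.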
Limitations
This study has several limitations. First, the data sources were confined to two social media platforms, Douyin and Xiaohongshu, which may not comprehensively represent ABKOL communication behaviors across broader digital environments. Second, while the study categorized ABKOLs based on content themes, it did not fully consider how individual characteristics—such as professional credentials, institutional affiliation, or follower demographics—may also influence their communication approaches and audience perception. Third, the number of ABKOLs in certain clusters was relatively small, particularly in Cluster 3 (
Conclusions
Using LDA topic modeling and clustering analysis, this study identified four functionally distinct types of ABKOLs—preventive advocates, therapeutic communicators, health promoters, and supportive companions—and examined their platform-specific distributions and dissemination strategies across Douyin and Xiaohongshu. The findings revealed substantial heterogeneity in content focus, expression style, and user interaction across ABKOL types, leading to the formation of a multidimensional communication framework that integrates professional knowledge delivery with emotional support. Sentiment analysis further underscored the differential emotional impacts of ABKOLs, with preventive advocates and health promoters fostering more positive engagement, while therapeutic communicators and supportive companions elicited more emotionally polarized responses. Platform ecosystems played a moderating role in shaping these dynamics, influencing both content presentation and functional differentiation. This study provides a valuable reference for the classification and strategic optimization of ABKOLs in digital health communication and offers practical insights for tailoring personalized health education interventions to platform-specific contexts.
Supplemental Material
sj-docx-1-dhj-10.1177_20552076251410043 - Supplemental material for Functional profiling of authoritative breast cancer–related key opinion leaders based on topic distributions: A comparative cluster analysis across social media platforms
Supplemental material, sj-docx-1-dhj-10.1177_20552076251410043 for Functional profiling of authoritative breast cancer–related key opinion leaders based on topic distributions: A comparative cluster analysis across social media platforms by Yiwen Duan, Qi Zhang, Yang Yang, Yajuan Weng, Tingting Cai and Changrong Yuan in DIGITAL HEALTH
Footnotes
Ethical approval and considerations
This study was conducted in accordance with the principles of the Declaration of Helsinki. All data were obtained from publicly available and verified accounts on Douyin and Xiaohongshu platforms. No private or personally identifiable information was collected, and all analyses were performed at an aggregated level. The data collection and processing procedures adhered to the platforms’ terms of service and relevant ethical guidelines for social media research.
Contributorship
YD contributed to conceptualization, methodology, data curation, formal analysis, visualization, and writing – original draft. QZ contributed to data preprocessing, visualization, and writing – review & editing. YY contributed to methodology and writing – review & editing. YW contributed to literature review and writing – review & editing. TC contributed to supervision, project administration, writing – review & editing, and funding acquisition. CY contributed to supervision, study design guidance, writing – review & editing, and funding acquisition. TC and CY are joint corresponding authors. All authors reviewed and approved the final manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China and the Ministry of Education of the People's Republic of China Humanities and Social Sciences Youth Foundation (grant numbers 72374048 and 23YJC630002).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
Data are available from the corresponding author on reasonable request.
Supplemental material
Supplemental material for this article is available online.
References
