Abstract

Background

Intracerebral hemorrhage (ICH) is the most devastating stroke subtype, characterized by high mortality and disability rates. With the rapid growth of short-video platforms, TikTok and BiliBili have become important channels for the public to obtain health information. However, the quality and reliability of ICH-related videos on these platforms have not been systematically evaluated.

Methods

On October 17–18, 2025, we collected the top 100 videos under the default comprehensive ranking from TikTok and from BiliBili, using the Chinese term “脑出血” (ICH) as the search keyword. After screening, 146 videos were included. Two independent reviewers assessed video quality and reliability using three standardized tools: the Global Quality Scale (GQS), the modified DISCERN (mDISCERN) instrument, and the JAMA benchmark criteria. Correlation analysis was used to evaluate the relationship between video quality and engagement metrics.

Results

Videos on BiliBili scored significantly higher than those on TikTok on both the GQS and mDISCERN (P < 0.001 and P = 0.039, respectively). Furthermore, BiliBili outperformed TikTok in terms of content completeness. However, TikTok demonstrated significantly higher engagement metrics (likes, favorites, comments, and shares) than BiliBili (P < 0.01 for all). Videos uploaded by healthcare professionals achieved the highest quality, with a median GQS score of 3 (IQR 2-4). Correlation analysis revealed a positive correlation between video length and quality scores, while the number of comments was negatively correlated with both video quality and length.

Conclusion

The quality and reliability of ICH-related videos were superior on BiliBili compared to TikTok, whereas TikTok exhibited overwhelming advantages in user engagement. Longer video duration was associated with better quality and reliability. Although videos from healthcare professionals scored higher in quality and reliability than those from individual users, they still did not meet the standard for high-quality information. This indicates the ongoing challenge of effectively translating specialized medical knowledge into reliable, practical, and easily understandable information for the public.

Keywords

ICH, social media, online videos, information quality, TikTok, BiliBili

Introduction

According to the latest Global Burden of Disease (GBD) study, stroke remains the second leading cause of death worldwide.¹ Intracerebral hemorrhage (ICH), although accounting for only 20% of stroke cases, is the most devastating stroke subtype globally, characterized by the highest rates of disability and mortality. Its sudden onset and critical nature impose a heavy burden on patients’ families and the healthcare system.² Despite continuous advancements in treatment, the prognosis for ICH remains poor, with a significant proportion of survivors left with severe neurological deficits. This not only severely compromises the patient’s quality of life but also places long-term and substantial economic and non-economic pressure on their families and the entire healthcare system.³ In this context, effective prevention, timely recognition, and scientific long-term management are crucial for improving ICH outcomes. This largely depends on the public’s accurate knowledge and understanding of the disease. Therefore, enhancing public awareness of ICH warning signs, risk factors, and emergency response is of paramount importance for reducing treatment delays and improving clinical outcomes.

In recent years, the widespread adoption of the internet and social media has fundamentally transformed how the public accesses health information.⁴ Among these, video platforms have emerged as prominent forces due to their intuitive, engaging, and easily disseminable nature, becoming a key information source for the public, particularly younger generations.⁵ As a convenient and efficient means of health communication, online science popularization education helps improve patients’ knowledge about their diseases and promotes their active participation in disease management.⁶ This indicates that online video-based science popularization, characterized by the integration of audio and visual elements, contextual presentation, and repetitive learning, can convey disease-related knowledge more intuitively, thereby exerting greater value in health education within clinical practice. According to data, China’s internet user base has reached 1.123 billion, with short-video users accounting for over 95% (1.068 billion). TikTok and BiliBili, as the two largest platforms in China by user volume, cover all age groups and are highly representative. Among them, TikTok’s total user base has exceeded 1.62 billion, with monthly active users approaching 1 billion and daily active users exceeding 600 million.⁷ Bilibili has 107 million daily active users and 348 million monthly active users.⁸ With distinct strengths including easy access, straightforward visual display, and swift information spread, these two platforms serve as vital instruments for delivering public health education. Previous international studies have shown that visual health education methods such as short videos also hold significant health communication value worldwide. 
For example, a study from the University of California found that, compared with patients who only read text information, those who watched video-based education reported significantly higher perceived effectiveness of opioid dose reduction (4.06 vs. 3.67, P < 0.001).⁹ Video-based health communication can therefore have practical value across different countries and cultural contexts, which motivates our evaluation of the quality of health information on video platforms. While these platforms offer unparalleled advantages in the breadth and speed of information dissemination, this very advantage carries significant hidden risks: the low barrier to content creation results in highly variable quality, with no guarantee of accuracy, thereby exposing the public to the dual risks of misinformation and disinformation.¹⁰ For critical emergencies like ICH, judgments based on incomplete or erroneous information can lead to delayed medical care, inappropriate self-management, or unrealistic expectations about prognosis, with potentially dire consequences.¹¹,¹²

Numerous studies have assessed the quality of information on various health topics on social media, ranging from chronic degenerative and autoimmune diseases to cardiovascular diseases and even cancer, confirming that short-video platforms have become a major channel for public health information.¹³–¹⁶ However, there is a lack of specialized, systematic evaluation of information quality for ICH, a critical neurological emergency, on these mainstream video platforms. Furthermore, most existing studies are limited to single-platform analyses and therefore cannot reveal, through cross-platform comparison, how different platforms affect information quality.

This study is the first to evaluate ICH-related content on two major Chinese video platforms, TikTok and BiliBili. We conducted a comprehensive assessment using three recognized instruments (the Global Quality Scale (GQS), the modified DISCERN (mDISCERN), and the JAMA benchmark criteria) to examine ICH videos from multiple dimensions, including informational completeness, scientific reliability, and content standardization.¹⁷–¹⁹ By analyzing platform characteristics, content creator backgrounds, and dissemination effects, this study aims to provide an evidence-based foundation for optimizing ICH-related health information dissemination.

Methods

Video retrieval and screening strategy

This study was conducted between October 17 and 18, 2025, on two leading short-video platforms: TikTok and BiliBili. To simulate the search behavior of an average user and minimize bias from personalized algorithmic recommendations, we used newly registered platform accounts, and all searches were performed after clearing the browser cache and history. The Chinese keyword “脑出血” (ICH) was used for searching on both platforms. To ensure reproducibility, we used each platform’s default “comprehensive ranking” filter, which ranks search results by a platform-specific weighted combination of relevance, engagement, and other parameters. The top 100 videos from each platform’s search results were recorded. The exclusion criteria were: (1) duplicates; (2) irrelevant content (videos that were retrieved by the search but whose actual content was unrelated to ICH); (3) advertisements; (4) videos without sound, or with only background music and no substantive content; (5) videos published within the last 7 days; (6) non-informative videos (videos whose theme was ICH but that contained no substantive health education information about ICH). The screening process is illustrated in Figure 1.
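The six exclusion criteria can be read as a sequential filter over the retrieved videos. The following is a minimal sketch of that reading, not the authors' actual workflow: the `Video` fields, the identification of duplicates by a shared ID, the order in which criteria are checked, and the treatment of "within the last 7 days" as fewer than 7 days before the first search day are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

SEARCH_DATE = date(2025, 10, 17)  # first search day reported in the Methods


@dataclass
class Video:
    # Hypothetical record; the flags stand in for a reviewer's manual judgment.
    video_id: str
    is_ich_related: bool
    is_advertisement: bool
    has_spoken_content: bool
    is_informative: bool
    published: date


def exclusion_reason(video, seen_ids, search_date=SEARCH_DATE):
    """Return the first exclusion criterion the video meets, or None if it is kept."""
    if video.video_id in seen_ids:
        return "duplicate"
    if not video.is_ich_related:
        return "irrelevant"
    if video.is_advertisement:
        return "advertisement"
    if not video.has_spoken_content:
        return "no sound or music only"
    # Assumption: "within the last 7 days" means fewer than 7 days before the search.
    if search_date - video.published < timedelta(days=7):
        return "published within last 7 days"
    if not video.is_informative:
        return "not informative"
    return None


def screen(videos, search_date=SEARCH_DATE):
    """Split videos into (kept, excluded), where excluded holds (video, reason) pairs."""
    kept, excluded, seen_ids = [], [], set()
    for video in videos:
        reason = exclusion_reason(video, seen_ids, search_date)
        seen_ids.add(video.video_id)
        if reason is None:
            kept.append(video)
        else:
            excluded.append((video, reason))
    return kept, excluded
```

Recording a reason for every excluded video makes it straightforward to tabulate the counts that a screening flowchart reports.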

Figure 1.

Flowchart of the search and screening strategy for ICH-related videos.

Data extraction and classification

Basic characteristics and engagement metrics (video duration, number of likes, favorites, comments, and shares) were extracted from each eligible video. Furthermore, uploaders were categorized into three groups based on their self-declared and platform-verified information: Specialized Healthcare Professionals (SHPs, e.g., certified neurologists/neurosurgeons), Non-Specialized Healthcare Professionals (NSHPs, e.g., nurses, other healthcare workers not specialized in neurology), and Individual Users (persons without any healthcare background certification). Additionally, we assessed whether the video content covered key clinical elements, including: epidemiology, etiology, clinical manifestations, diagnosis, treatment, prevention, and prognosis. The mention of each element was recorded as “yes” or “no”.

Quality and reliability assessment

Video quality and reliability were independently assessed by two reviewers (BSH and AQC). To ensure rigor and objectivity, three internationally validated instruments were employed. First, the GQS was used to evaluate the overall information flow, clarity, and patient usefulness of the videos (Table 1). Second, the mDISCERN instrument was applied to assess information reliability (Table 2). Third, the JAMA benchmark criteria (Table 3) were used to measure whether the health information adhered to fundamental principles of credibility. Discrepancies between the independent assessments were first resolved through discussion to reach a consensus. If a disagreement persisted, a third clinical expert (QWH) made the final decision. Table 4 summarizes the distinctions among GQS, mDISCERN, and JAMA in evaluating content. Because the intraclass correlation coefficient (ICC) is suitable for continuous rating data and accounts for both inter-rater agreement and scoring reliability, inter-rater agreement was also analyzed using the ICC.²⁰ ICC values were interpreted as follows: excellent agreement (ICC > 0.9); good agreement (0.75–0.9); moderate agreement (0.5–0.75); poor agreement (<0.5). By these thresholds, inter-rater agreement was good for the JAMA score (ICC = 0.852) and excellent for the mDISCERN score (ICC = 0.914) and the GQS score (ICC = 0.956), with the GQS showing the highest agreement.
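The interpretation bands for the ICC can be written as a small lookup. This is an illustrative sketch only: the function name is hypothetical, and because the stated bands share their endpoints, the handling of values exactly equal to 0.5, 0.75, or 0.9 is an assumption (each boundary value is assigned to the band that begins or ends there as coded below).

```python
def interpret_icc(icc):
    """Map an ICC value to the agreement label used in this section."""
    if icc > 0.9:
        return "excellent"
    if icc >= 0.75:  # assumption: 0.75 and 0.9 themselves count as "good"
        return "good"
    if icc >= 0.5:   # assumption: 0.5 itself counts as "moderate"
        return "moderate"
    return "poor"


# ICC values reported in this section.
reported_icc = {"JAMA": 0.852, "GQS": 0.956, "mDISCERN": 0.914}
for tool, icc in reported_icc.items():
    print(f"{tool}: ICC = {icc:.3f} -> {interpret_icc(icc)}")
```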

Table 1.

The Global Quality Scale (GQS) quality criteria.

Item features	Points
Poor quality; poor flow of the videos; most information missing; not at all useful for patients	1
Generally poor quality; some information listed, but many important topics missing; of very limited use to patients	2
Moderate quality; suboptimal flow; some important information adequately discussed, but other information poorly discussed; somewhat useful for patients	3
Good quality and generally good flow; most of the relevant information listed, but some topics not covered; useful for patients	4
Excellent quality and flow; very useful for patients	5

Table 2.

The Modified DISCERN (mDISCERN) quality criteria.

Reliability score
1. Is the video clear, concise, and understandable?
2. Are valid sources cited?
3. Is the content presented balanced and unbiased?
4. Are additional sources of content listed for patient reference?
5. Are areas of uncertainty mentioned?

Table 3.

The Journal of the American Medical Association (JAMA) benchmark criteria.

Criterion	Description	Points
Authorship	Author and contributor credentials and their affiliations should be provided.	1
Attribution	Clearly lists all copyright information and states references and sources for content.	1
Currency	Initial date of posted content and subsequent updates to content should be provided.	1
Disclosure	Conflicts of interest, funding, sponsorship, advertising, support, and video ownership should be fully disclosed.	1
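Both the mDISCERN and JAMA instruments are checklists, so a video's score can be sketched as a count of satisfied items. The sketch below assumes one point per "yes" answer, giving ranges of 0 to 5 for mDISCERN and 0 to 4 for JAMA. Table 3 assigns one point per JAMA criterion, whereas the one-point-per-item rule for mDISCERN is an assumption here, since Table 2 lists only the questions. The item keys and function name are hypothetical labels for the table rows.

```python
# Hypothetical keys, one per question in Table 2.
MDISCERN_ITEMS = (
    "clear_concise_understandable",
    "valid_sources_cited",
    "balanced_unbiased",
    "additional_sources_listed",
    "uncertainty_mentioned",
)

# Hypothetical keys, one per criterion in Table 3.
JAMA_ITEMS = ("authorship", "attribution", "currency", "disclosure")


def checklist_score(answers, items):
    """Count the checklist items answered True; every item must be answered."""
    missing = [item for item in items if item not in answers]
    if missing:
        raise KeyError(f"unanswered items: {missing}")
    return sum(1 for item in items if answers[item])


# Example: a video that is clear and balanced but cites no sources,
# and that shows only the uploader's credentials and the posting date.
mdiscern_answers = {
    "clear_concise_understandable": True,
    "valid_sources_cited": False,
    "balanced_unbiased": True,
    "additional_sources_listed": False,
    "uncertainty_mentioned": False,
}
jama_answers = {
    "authorship": True,
    "attribution": False,
    "currency": True,
    "disclosure": False,
}
mdiscern_total = checklist_score(mdiscern_answers, MDISCERN_ITEMS)
jama_total = checklist_score(jama_answers, JAMA_ITEMS)
```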

Table 4.

Comparison of health information quality evaluation tools.

Criterion	GQS	mDISCERN	JAMA
Primary Focus	Overall User Experience. Evaluates the general flow, quality, and helpfulness of the content from a layperson’s perspective.	Reliability & Content Validity. Evaluates the reliability of the medical information and whether it helps viewers make informed treatment choices.	Transparency & Accountability. Evaluates the credibility of the source rather than the content itself.
Accuracy Assessment	Indirect. Does not verify medical facts directly; assumes that “high quality” flow implies accurate information.	High. Uses citations and clear aims as proxies for accuracy. It asks if the information is relevant and supported.	None. Does not evaluate the accuracy of the medical claims. A video can satisfy JAMA criteria but still contain incorrect advice.
Bias Detection	Low. Does not have a specific metric to measure commercial bias or lack of balance.	High. Specifically includes a criterion to determine if the content is balanced and unbiased.	Moderate. Checks for “Disclosure” (conflicts of interest/ownership), but does not assess editorial balance.
Transparency	Low. Does not strictly require dates, authors, or references to achieve a high score.	Moderate. Checks if sources are cited and if the publication date is clear.	High. The core function of the tool. Measures Authorship, Attribution, Disclosure, and Currency.
Ease of Use	High. Typically a single 5-point Likert scale question. Extremely fast for reviewers to apply.	Moderate. Usually a 5-question survey. Requires the reviewer to analyze the content deeply before scoring.	High. A simple checklist of 4 binary criteria (Present/Absent). Very quick to verify.
Subjectivity	High. Heavily dependent on the reviewer’s personal opinion and impression of the video/article.	Moderate. Guided by specific questions, but still requires the reviewer to interpret “quality” and “balance.”	Low (Objective). Least subjective. The criteria are factual and binary.

Statistical analysis

All statistical analyses were performed using R software (version 4.3.2). As continuous variables (e.g., video duration, engagement metrics) were non-normally distributed, data are described using medians and interquartile ranges (IQRs). The Mann-Whitney U test was used for comparisons between two groups, and the Kruskal-Wallis H test was used for comparisons among three or more groups. When a statistically significant difference was found in the Kruskal-Wallis H test, Dunn’s test was used for post-hoc pairwise comparisons. Categorical variables are presented as frequencies and percentages. Spearman’s rank correlation analysis was used to assess the relationships between video variables (e.g., duration, engagement metrics, and quality scores). A P-value of < 0.05 was considered statistically significant.
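To make the descriptive and two-group steps concrete, the sketch below computes a median with quartiles and the Mann-Whitney U statistic using only the Python standard library. It is an illustration and not the authors' R code. The quartiles use linear interpolation (`method="inclusive"`), which corresponds to R's default quantile type 7; whether the authors used that default is an assumption. The function returns only the U statistics, and a P-value would additionally require the normal approximation or exact tables. The function names are hypothetical.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3), with quartiles by linear interpolation."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, from pooled average ranks."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical durations in seconds for two small groups (not study data).
platform_a = [310, 95, 640, 120]
platform_b = [100, 35, 160, 60]
summary_a = median_iqr(platform_a)
u_a, u_b = mann_whitney_u(platform_a, platform_b)
```

The same ranking helper underlies the Kruskal-Wallis H test, which generalizes the rank-sum comparison to three or more groups.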

Results

General characteristics of videos

Using the keyword “脑出血” (ICH), we retrieved the top 100 videos from each platform, TikTok and BiliBili. After screening, 80 videos from TikTok and 66 from BiliBili were included in the final analysis (Figures 1 and 2). Key metrics collected included video duration, likes, favorites, comments, and shares (Table 5). Among the 146 videos analyzed, the median video length was 143.50 seconds. The median number of likes was 3,779, favorites was 1,237, comments was 254.5, and shares was 394.

Figure 2.

Distribution of included ICH-related videos across platforms.

Table 5.

General characteristics, quality, and reliability of the videos.

Variables	Total (n = 146)
General information
Video length (s), M (Q1, Q3)	143.50 (69.25, 301.75)
Likes, M (Q1, Q3)	3779.00 (184.50, 33408.50)
Favorites, M (Q1, Q3)	1237.00 (182.75, 7353.75)
Comments, M (Q1, Q3)	254.50 (20.00, 1644.25)
Shares, M (Q1, Q3)	394.00 (38.25, 10726.00)
Video content
Epidemiology	20 (13.6%)
Etiology	66 (45.2%)
Clinical manifestation	66 (45.2%)
Diagnosis	47 (32.1%)
Treatment	78 (53.4%)
Prevention	54 (36.9%)
Prognosis	49 (33.5%)
Video quality
GQS score, M (Q1, Q3)	3.00 (1.00, 3.00)
mDISCERN score, M (Q1, Q3)	2.00 (2.00, 2.00)
JAMA score, M (Q1, Q3)	2.00 (2.00, 2.00)

Content completeness was assessed by evaluating the coverage of key elements: epidemiology, etiology, clinical manifestations, diagnosis, treatment, prevention, and prognosis. Out of the 146 videos, 20 (13.6%) covered epidemiology, 66 (45.2%) covered etiology, 66 (45.2%) covered clinical manifestations, 47 (32.1%) covered diagnosis, 78 (53.4%) covered treatment, 54 (36.9%) covered prevention, and 49 (33.5%) covered prognosis.
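The coverage proportions follow directly from the counts and the total of 146 videos. The sketch below recomputes them with conventional rounding to one decimal place; the dictionary and function names are hypothetical. Recomputed values can differ from the printed ones by 0.1 percentage point where the printed figure appears to have been truncated rather than rounded (for example, 20/146 is 13.70% to two decimals).

```python
TOTAL_VIDEOS = 146

# Counts of videos covering each element, as listed in Table 5.
coverage_counts = {
    "Epidemiology": 20,
    "Etiology": 66,
    "Clinical manifestations": 66,
    "Diagnosis": 47,
    "Treatment": 78,
    "Prevention": 54,
    "Prognosis": 49,
}


def coverage_percent(count, total=TOTAL_VIDEOS, digits=1):
    """Percentage of videos covering an element, rounded to `digits` decimals."""
    return round(100 * count / total, digits)


for element, count in coverage_counts.items():
    print(f"{element}: {count}/{TOTAL_VIDEOS} = {coverage_percent(count)}%")
```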

Video quality and reliability were assessed using the GQS, mDISCERN, and JAMA scores (Table 5). The median scores for the overall pool of videos were GQS: 3 (IQR 1-3), mDISCERN: 2 (IQR 2-2), and JAMA: 2 (IQR 2-2), indicating generally moderate to low quality and reliability across both platforms.

Analysis of video characteristics by platform

Significant differences were observed between videos on TikTok and BiliBili (Table 6). The median video duration on BiliBili (308 seconds) was significantly longer than on TikTok (100.5 seconds) (P < 0.001). Conversely, TikTok videos demonstrated significantly higher median engagement metrics across all measures: likes (28,076.5 vs. 176.5; P < 0.001), favorites (5,173.5 vs. 186; P < 0.001), comments (1,192.5 vs. 15; P < 0.001), and shares (6,907 vs. 38.5; P < 0.001).

Table 6.

General information, quality, and reliability scores of ICH videos on TikTok and BiliBili.

Variables	BiliBili (n = 66)	TikTok (n = 80)	P
General information
Video length (s), M (Q1, Q3)	308.00 (110.75, 656.00)	100.50 (35.25, 158.00)	<.001
Likes, M (Q1, Q3)	176.50 (51.50, 696.50)	28076.50 (12434.00, 72224.25)	<.001
Favorites, M (Q1, Q3)	186.00 (51.50, 436.75)	5173.50 (1821.00, 22572.00)	<.001
Comments, M (Q1, Q3)	15.00 (3.00, 72.50)	1192.50 (516.25, 5081.50)	<.001
Shares, M (Q1, Q3)	38.50 (13.00, 99.50)	6907.00 (865.25, 36514.50)	<.001
Video content
Epidemiology	9 (13.6%)	11 (13.8%)	-
Etiology	32 (48.5%)	34 (42.5%)	-
Clinical manifestation	32 (48.5%)	34 (42.5%)	-
Diagnosis	29 (43.9%)	18 (22.5%)	-
Treatment	48 (72.7%)	30 (37.5%)	-
Prevention	48 (72.7%)	36 (45.0%)	-
Prognosis	36 (54.6%)	19 (23.8%)	-
Video quality
GQS score, M (Q1, Q3)	3.00 (2.00, 3.00)	2.00 (1.00, 3.00)	<.001
mDISCERN score, M (Q1, Q3)	2.00 (2.00, 2.00)	2.00 (2.00, 2.00)	0.039
JAMA score, M (Q1, Q3)	2.00 (2.00, 2.00)	2.00 (2.00, 2.00)	0.177

Content completeness also varied by platform (Table 6, Figure 3). A similarly low proportion of videos covered epidemiology on both BiliBili (13.6%) and TikTok (13.8%). The coverage of etiology (48.5% vs. 42.5%) and clinical manifestations (48.5% vs. 42.5%) was comparable between BiliBili and TikTok, respectively. However, BiliBili had a substantially higher proportion of videos covering diagnosis (43.9% vs. 22.5%), treatment (72.7% vs. 37.5%), prevention (72.7% vs. 45.0%), and prognosis (54.6% vs. 23.8%) than TikTok, suggesting more comprehensive content on BiliBili.

Figure 3.

Content coverage of ICH-related videos on TikTok and BiliBili.

Quality and reliability assessments by platform are detailed in Table 6 and Figure 4. BiliBili videos achieved significantly higher median GQS scores (3, IQR 2-3) compared to TikTok videos (2, IQR 1-3) (P < 0.001) (Figure 4(a)). Similarly, the mDISCERN score was significantly higher for BiliBili than for TikTok (P = 0.039) (Figure 4(b)). In contrast, no significant difference was found in the JAMA scores between the two platforms (P = 0.177) (Figure 4(c)). These results suggest that the quality and reliability of ICH-related videos are superior on BiliBili compared to TikTok.

Figure 4.

Quality and reliability assessments of ICH-related videos on TikTok and BiliBili using GQS, mDISCERN, and JAMA scores. (A) Violin plot of GQS scores. (B) Violin plot of mDISCERN scores. (C) Violin plot of JAMA scores. (*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001).

Analysis of video characteristics by uploader

Significant differences were also found among videos from different uploader categories: SHPs, NSHPs, and Individual Users. The distribution of these uploader categories was similar across BiliBili and TikTok (Figure 5(a) and (b)). Of the 146 videos, 52 were uploaded by SHPs, 52 by NSHPs, and 42 by Individual Users (Table 7).

Figure 5.

Proportion of video uploaders categorized as SHPs, NSHPs, and Individual Users on TikTok and BiliBili. (A) Pie chart showing the overall proportion of uploader categories (SHPs, NSHPs, and Individual Users). (B) Percentage stacked bar chart comparing the distribution of uploader categories between TikTok and BiliBili.

Table 7.

Characteristics, quality, and reliability of ICH videos by different uploaders on TikTok and BiliBili.

Variables	Total (n = 146)	Individual users (n = 52)	SHPs (n = 52)	NSHPs (n = 42)	P
Video length, M (Q₁, Q₃)	143.50 (69.25, 301.75)	111.00 (41.50,229.75)	197.50 (110.00,573.50)	108.00 (86.25,207.00)	0.012
Likes, M (Q₁, Q₃)	3779.00 (184.50, 33408.50)	3063.00 (599.75,24966.00)	7543.50 (112.50,39276.00)	1651.50 (170.75,41320.00)	0.995
Favorites, M (Q₁, Q₃)	1237.00 (182.75, 7353.75)	446.00 (79.75,2449.50)	2701.50 (190.75,16436.00)	1385.00 (246.50,17542.25)	0.006
Comments, M (Q₁, Q₃)	254.50 (20.00, 1644.25)	469.50 (91.25,1927.75)	223.00 (3.00,1441.00)	93.00 (20.00,1284.75)	0.058
Shares, M (Q₁, Q₃)	394.00 (38.25, 10726.00)	313.50 (23.75,3872.25)	1041.00 (51.25,11831.75)	298.50 (69.00,16677.25)	0.271
GQS score, M (Q₁, Q₃)	3.00 (1.00, 3.00)	1.00 (1.00,2.00)	3.00 (2.00,4.00)	3.00 (3.00,3.00)	<.001
mDISCERN score, M (Q₁, Q₃)	2.00 (2.00, 2.00)	2.00 (2.00,2.00)	2.00 (2.00,2.00)	2.00 (2.00,2.00)	0.003
JAMA score, M (Q₁, Q₃)	2.00 (2.00, 2.00)	2.00 (2.00,2.00)	2.00 (2.00,2.00)	2.00 (2.00,2.00)	0.184

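A significant difference in video duration was observed among the three groups (P = 0.012; Table 7). Videos from SHPs had the longest median duration (197.5 seconds), followed by Individual Users (111 seconds) and NSHPs (108 seconds). Regarding engagement, videos from SHPs received the highest median numbers of likes (7,543.5), favorites (2,701.5), and shares (1,041), although only the difference in favorites was statistically significant across groups (P = 0.006), whereas the differences in likes (P = 0.995) and shares (P = 0.271) were not.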

The quality and reliability scores by uploader category are presented in Table 7 and Figure 6. For the GQS score (Table 7, Figure 6(a) and (b)), videos from SHPs had the highest median score (3, IQR 2-4), followed by NSHPs (3, IQR 3-3) and Individual Users (1, IQR 1-2). The GQS scores for both SHPs and NSHPs were significantly higher than for Individual Users, but no significant difference was found between SHPs and NSHPs. For the mDISCERN score (Table 7, Figure 6(c) and (d)), the median score was 2 (IQR 2-2) for all three groups; however, the scores for both SHPs and NSHPs were statistically significantly higher than for Individual Users, with no significant difference between the two professional groups. For the JAMA score (Table 7, Figure 6(e) and (f)), the median score was 2 (IQR 2-2) for all three groups, with no statistically significant differences among them.

Figure 6.

Quality and reliability assessments of ICH-related videos by uploader category using GQS, mDISCERN, and JAMA scores. (A) Ridge plot of GQS scores across uploader categories (Overall, SHPs, NSHPs, and Individual Users). (B) Ridge plot of mDISCERN scores across uploader categories. (C) Ridge plot of JAMA scores across uploader categories. (*P < 0.05,**P < 0.01, ***P < 0.001,****P < 0.0001).

Correlation analysis

We performed a correlation analysis between basic video characteristics/engagement metrics (duration, likes, favorites, comments, shares) and quality/reliability indicators (GQS, mDISCERN, JAMA scores) across the 146 videos (Figure 7). Strong positive correlations were observed among likes, favorites, comments, and shares. Video duration showed low-to-moderate positive correlations with both the GQS and mDISCERN scores. Conversely, the number of comments exhibited negative correlations with both video duration and the GQS score.
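Spearman's rank correlation, the measure underlying this analysis, is the Pearson correlation computed on ranks, with tied values sharing their average rank. The following standard-library sketch illustrates the calculation on hypothetical numbers; it is not the authors' R code, the function names are illustrative, and it returns only the coefficient, not a P-value.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical durations (seconds) and GQS scores for five videos (not study data).
durations = [40, 95, 150, 320, 600]
gqs_scores = [1, 2, 3, 4, 3]
rho = spearman_rho(durations, gqs_scores)
```

Because quality scores such as the GQS take only a few integer values, ties are frequent, which is why the rank helper averages tied ranks instead of relying on the shortcut formula for untied data.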

Figure 7.

Heatmap of correlation analysis between video characteristics, engagement metrics, and quality/reliability scores. × indicates a correlation that was not statistically significant (P ≥ 0.05).

Discussion

In this cross-sectional assessment of ICH-related videos on TikTok and BiliBili, our study revealed that: (a) the overall quality and reliability scores (GQS, mDISCERN, JAMA) were low; (b) compared to TikTok, videos on BiliBili were significantly longer and provided more comprehensive coverage of key clinical elements such as diagnosis, treatment, prevention, and prognosis; however, TikTok demonstrated superior performance in user engagement metrics including likes, favorites, comments, and shares; (c) videos created by SHPs received higher quality scores and better audience feedback, indicating that the professionalism and credibility of the content are key factors in enhancing the effectiveness of public health education; (d) strong correlations were observed among various engagement metrics; video length showed low-to-moderate positive correlations with GQS and mDISCERN scores, while the number of comments was negatively correlated with both video length and the GQS score.

Data from the 146 videos indicate that the significantly longer duration of videos on BiliBili reflects its user base’s higher acceptance of in-depth, systematic content. The platform’s ecosystem is more conducive to explaining complex health knowledge.²¹ In terms of user engagement, however, TikTok’s metrics were orders of magnitude higher, with likes, favorites, comments, and shares far exceeding those on BiliBili. This discrepancy is largely attributable to the distinct positioning and algorithms of the two platforms. BiliBili was originally designed for knowledge-oriented content, and its user base includes more students and medical students. Its recommendation system prioritizes educational and in-depth videos. In contrast, TikTok’s algorithm promotes fast-paced short videos, and its distribution model, heavily based on engagement metrics like likes and watch time, typically favors eye-catching content over purely informational material.²² This suggests a significant trade-off between informational depth/completeness and dissemination breadth/engagement. Well-produced, comprehensive long-form videos (as found on BiliBili), while capable of providing broader coverage of clinical elements and helping to establish professional authority, face limitations in their dissemination potential and user interaction levels, making it difficult to achieve widespread, immediate engagement. 
Conversely, concise short-form content (as found on TikTok), although often compromising on depth and comprehensiveness when dealing with complex medical topics, holds a distinct advantage in stimulating user interaction and achieving viral spread.²³ In addition, BiliBili’s search and recommendation algorithm might perform less effectively than TikTok’s, as our retrieval process revealed that BiliBili’s comprehensive sorting yielded a higher proportion of videos unrelated to ICH (e.g., on ischemic stroke, myocardial infarction) compared with TikTok, and such irrelevant videos were excluded.

Our systematic analysis of content coverage related to ICH on TikTok and BiliBili revealed significant disparities in the coverage of different clinical elements, highlighting structural shortcomings in current health education content. The most prominent issue identified was the severe neglect of epidemiological information regarding ICH. Among the 146 videos, only 13.6% mentioned relevant epidemiological data, a pattern consistently observed across both TikTok and BiliBili.²⁴,²⁵ Epidemiological information, such as disease incidence and high-risk groups (e.g., specific age ranges, individuals with hypertension or diabetes), is fundamental for fostering public risk awareness and enabling preliminary self-assessment. The widespread absence of this content may prevent the public, particularly high-risk individuals, from accurately evaluating their personal risk. This could lead to the neglect of necessary preventive measures and early warning signs, potentially delaying timely medical intervention.²⁶ Furthermore, coverage of other critical aspects (diagnosis, prevention, and prognosis) was also insufficient overall, with notable disparities between platforms. Across all 146 videos, the proportions covering diagnosis (32.1%), prognosis (33.5%), and prevention (36.9%) each remained below 40%. 
This lack of information can have serious consequences: insufficient knowledge about diagnosis may prevent the public from understanding the importance of essential diagnostic tools (such as CT scans); a scarcity of prognostic information can distort patient and family expectations regarding recovery and long-term management, potentially leading to unnecessary anxiety or unrealistic treatment outcome expectations; and weak coverage of prevention—a cornerstone of stroke management—directly impacts the effectiveness of primary prevention efforts.²⁷ It is noteworthy that BiliBili demonstrated significantly higher coverage of in-depth information on diagnosis, treatment, prevention, and prognosis compared to TikTok. This aligns with its longer video format, which is more conducive to systematic explanation. However, TikTok’s comparative weakness in these critical clinical areas, despite its superior reach and engagement, implies that the information flow reaching the broadest audience may lack sufficient depth and practical utility. This disconnect between “breadth” and “depth” risks rendering widely disseminated content ineffective in translating into actionable knowledge and improved health literacy for public decision-making.²⁸

Our study also systematically evaluated ICH-related videos from both platforms using the GQS, mDISCERN, and JAMA instruments. Overall, the quality and reliability scores for all videos were moderate to low. Low scores on GQS, mDISCERN, and JAMA indicate issues such as “insufficient source citation, incomplete logical structure, and weak evidence chains,” which are particularly dangerous for an acute and severe condition like ICH. This suggests that the overall quality of ICH-related health information on social media still requires significant improvement, with an urgent need to enhance its accuracy, comprehensiveness, and reference value.²⁹ However, significant differences in quality and reliability scores were observed between the platforms. Videos on BiliBili demonstrated significantly higher quality and reliability than those on TikTok: GQS and mDISCERN scores were both significantly higher on BiliBili, indicating superior performance in information flow completeness, patient usefulness, and content reliability. This finding is consistent with similar studies on other diseases.³⁰,³¹ This disparity likely stems primarily from the distinct content ecosystems and user consumption habits of the two platforms. First, as mentioned previously, BiliBili allows and encourages longer video content. This provides creators, particularly SHPs, with sufficient space to explain the complex medical knowledge of ICH systematically and in depth, covering aspects such as diagnostic basis, treatment options, and prognosis management. This is corroborated by the higher coverage of key clinical elements like diagnosis, treatment, prevention, and prognosis in BiliBili videos. Second, BiliBili’s community culture has a stronger “knowledge-oriented” characteristic, where users have higher acceptance and demand for deep, professional content, thereby incentivizing creators to produce higher-quality videos. 
Finally, BiliBili’s user base likely includes a higher proportion of students, including medical students, making them a more receptive audience for sufficiently long and comprehensive ICH videos. In contrast, TikTok’s core strengths lie in high interactivity and powerful dissemination capabilities. However, its short-video format and algorithm that prioritizes immediate engagement may be structurally disadvantageous for the complete presentation of complex medical information. To capture viewer attention within a short timeframe, content often tends to focus on more visually impactful or dramatic aspects, such as symptom descriptions, at the expense of in-depth information requiring rigorous explanation, like diagnosis and prognosis.³²
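The platform comparison discussed above rests on medians and a rank-based contrast of ordinal scores. The following is a minimal sketch of how such a comparison could be computed, assuming a Mann-Whitney-style U statistic; the test actually applied is specified in the Statistical analysis section rather than here. The function names (`median_iqr`, `mann_whitney_u`) and all scores are hypothetical illustrations, not data from this study.

```python
from statistics import median, quantiles


def median_iqr(scores):
    """Return (median, Q1, Q3) using the inclusive quartile method."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q1, q3


def mann_whitney_u(a, b):
    """U statistic for sample a: each pair with a > b counts 1, each tie counts 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


# Hypothetical GQS scores (1-5 scale); NOT values from the study.
bilibili = [3, 4, 2, 4, 3, 5, 3]
tiktok = [2, 3, 2, 1, 3, 2, 2]

u = mann_whitney_u(bilibili, tiktok)
# Share of cross-platform pairs in which the BiliBili video scores higher
# (ties split evenly); 0.5 would indicate no difference.
effect = u / (len(bilibili) * len(tiktok))

print(median_iqr(bilibili), median_iqr(tiktok), u, round(effect, 3))
```

A rank-based statistic suits this setting because GQS and mDISCERN are ordinal scales, so only the ordering of scores, not their spacing, is treated as meaningful. A P value would additionally require the null distribution of U, which this sketch omits.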

This study categorized video sources into three groups based on uploader identity: SHPs, NSHPs, and individual users. Our analysis revealed significant differences in the quality and reliability of ICH-related videos produced by different uploader groups. First, videos uploaded by SHPs demonstrated significantly superior quality compared to the other two categories. Specifically, these videos received the highest GQS scores, reflecting better performance in information flow, clarity, and patient usefulness. This finding aligns with expectations, as SHPs possess systematic medical knowledge and clinical experience that enable them to ensure the accuracy and scientific rigor of health education content.³³,³⁴ A notably positive finding is that SHP-created videos also achieved the highest interaction metrics (likes, favorites, and shares). This suggests that in the serious medical domain of ICH, the public can recognize content from professional sources and prefers to trust and disseminate it, creating a virtuous cycle between professionalism and dissemination reach. However, our results also reveal a crucial issue: despite the relative superiority of SHP content, its GQS scores remained only at a moderate level, and all uploader groups demonstrated generally low mDISCERN and JAMA scores. This indicates that even content created by professionals has substantial room for improvement in terms of information reliability, citation of references, avoidance of bias, and content comprehensiveness. Previous research has similarly identified this problem.³⁵,³⁶ The most plausible explanation is that professionals may compromise informational rigor and depth to adapt to the characteristics of short-video platforms. Given the current popularity of short-form videos and the high interactivity of concise content, medical and public health professionals need to achieve broader reach without compromising the depth of information. 
SHPs might be able to use short and engaging videos as entry points to effectively capture viewers’ attention, while directing those seeking more detailed information via embedded links to longer, more comprehensive content published on other platforms. In addition, to improve the quality of videos produced by healthcare professionals, we suggest that SHPs adopt specific, actionable strategies such as: (1) clearly disclosing their professional qualifications and the educational purpose of the content at the beginning of the video; (2) citing authoritative guidelines or peer-reviewed evidence when presenting treatment recommendations or disease information; (3) following standardized formats for health information delivery to reduce ambiguity and enhance consistency; and (4) collaborating with professional associations to develop checklists or templates that help ensure completeness and accuracy of content. Such practices would not only raise the quality of professional videos but also make them more trustworthy and accessible to the public.

Correlation analysis of metrics including video length, likes, favorites, comments, and shares across the 146 videos revealed complex yet instructive relationships among the indicators for ICH-related videos, providing important insights into the dissemination patterns and quality representation of health information on social media. First, our correlation results showed strong positive correlations among engagement metrics such as likes, favorites, comments, and shares. However, no significant correlation was observed between these engagement metrics and quality/reliability scores. This suggests that these metrics collectively reflect a video’s “popularity” or “dissemination reach,” but they do not necessarily serve as direct proxies for its intrinsic informational quality.³⁷,³⁸ A video that triggers strong emotional responses or interest in users is likely to see simultaneous increases across all engagement indicators. Second, a key finding was the low-to-moderate positive correlation between video length and both GQS and mDISCERN scores, a result consistent with previous findings reported by Lei et al.³⁹,⁴⁰ This strongly indicates that longer duration provides the necessary space for systematically and thoroughly explaining complex medical knowledge, thereby contributing to enhanced content comprehensiveness and reliability. This finding corroborates the superior quality observed with longer videos on BiliBili, underscoring the value of “depth” in health education. However, a noteworthy and cautionary phenomenon was the negative correlation between the number of comments and both video length and GQS score. This may imply that lengthy or overly specialized videos might fail to engage broader viewership and could potentially suppress users’ immediate inclination to comment, possibly because viewers require time to digest the information or feel less capable of participating in in-depth discussions. 
Furthermore, a more critical interpretation is that a high number of comments may not signal high quality but could instead indicate contentious content. Videos that generate numerous comments might contain controversial viewpoints, unsubstantiated claims, or oversimplified conclusions, thereby provoking emotional debates or panic-driven inquiries—content characteristics typically contrary to high-quality, rigorous, and objective health education.
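The correlation patterns described above can be illustrated with a rank correlation, which suits skewed engagement counts and ordinal quality scores. The sketch below assumes Spearman's rho; the coefficient actually used is defined in the Statistical analysis section rather than here. The helper names (`average_ranks`, `spearman_rho`) and the toy data are hypothetical and are not values from this study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


# Hypothetical toy data; NOT values from the study.
duration_s = [35, 60, 92, 180, 240, 410]
gqs = [2, 2, 3, 3, 4, 5]
comments = [900, 650, 400, 120, 80, 15]

rho_length_quality = spearman_rho(duration_s, gqs)     # positive in this toy set
rho_comments_quality = spearman_rho(comments, gqs)     # negative in this toy set
print(round(rho_length_quality, 3), round(rho_comments_quality, 3))
```

In this toy data set, longer videos rank higher on quality while heavily commented videos rank lower, mirroring the direction of the associations reported above. The coefficient describes association only and says nothing about why comments and quality diverge.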

We suggest TikTok refine its algorithm to give prominent placement in search results to health education videos produced by verified medical practitioners; concurrently, the platform ought to support and incentivize creators to make longer videos that offer more structured and thorough health education content. Meanwhile, BiliBili should further consolidate its strength in in-depth explanations and build a systematic knowledge repository. Content creators, particularly SHPs, need to focus on addressing critical content gaps such as epidemiological background, diagnostic basis, and prognosis management. For viewers/patients, we recommend selecting longer, more comprehensive videos released by SHPs to acquire disease-related knowledge.

While this study provides valuable insights into the quality and dissemination patterns of ICH-related information on short-video platforms, several limitations must be acknowledged. First, the data were sourced by sampling the top 100 videos from each platform at a specific point in time (October 17-18, 2025). This approach may not fully capture all relevant content on the platforms, and the results are susceptible to the dynamic nature of platform-specific recommendation algorithms and content updates. Second, in Chinese, “脑出血” is the more normative and commonly used medical term, while “脑溢血” (ICH) was formerly a colloquial label employed by a limited group and is seldom referenced today. “脑出血” is now broadly utilized as the Chinese shorthand for ICH across lay and professional medical discourse. By employing the Chinese search term “脑出血”, our study may have missed videos indexed under alternative expressions such as “脑溢血,” potentially restricting the overall completeness of the results. Third, although we employed validated assessment tools such as GQS and mDISCERN, the evaluation of video quality inevitably contains a subjective component. While we implemented measures to ensure inter-rater consistency, including using two independent reviewers and an arbitration process, potential inter-rater variability may still exist. Furthermore, as a cross-sectional study, this work reveals associations and differences between platforms but cannot establish causality. In addition, because TikTok and BiliBili differ in user scale and we did not obtain the audience size of each video (such as the number of plays), direct cross-platform comparison of engagement metrics such as likes and comments cannot rule out bias arising from platform differences. 
Finally, although algorithms, user profiling, and content moderation mechanisms vary across platforms, simplifying content to fit platform constraints and thereby reducing rigor is a common trend. These platforms are gradually emerging as important channels for health education, and existing studies have shown that English short videos can be used to disseminate general medical information (e.g., alcohol risk education,⁴¹ early childhood language learning⁴²), but no research has specifically examined stroke-related content. This research gap suggests that our findings may have international relevance, implying that similar adaptations of professional content to platform constraints may exist in other language regions. However, due to differences in language, culture, and healthcare systems, as well as limitations in data availability, these conclusions need to be further validated in non-Chinese contexts by incorporating local platform characteristics. Future research should conduct multinational, multilingual comparative studies to gain deeper insight into how health information is disseminated and evaluated in different socio-technical environments.
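The limitation concerning missing play counts can be made concrete with a small calculation. The sketch below shows, under the assumption that per-video play counts were available, how raw engagement could be rescaled to events per 1,000 plays before comparing platforms. The `Video` record, the `per_thousand_plays` helper, and every number are hypothetical; play counts were not collected in this study.

```python
from dataclasses import dataclass


@dataclass
class Video:
    platform: str
    plays: int     # hypothetical field; not available in the study data
    likes: int
    comments: int


def per_thousand_plays(count, plays):
    """Engagement events per 1,000 plays; None when plays is not positive."""
    if plays <= 0:
        return None
    return 1000 * count / plays


# Hypothetical records; NOT values from the study.
videos = [
    Video("TikTok", plays=2_000_000, likes=40_000, comments=3_000),
    Video("BiliBili", plays=50_000, likes=2_500, comments=200),
]

like_rates = {v.platform: per_thousand_plays(v.likes, v.plays) for v in videos}
print(like_rates)
```

In this invented example the TikTok video has far more likes in absolute terms, yet the BiliBili video receives more likes per 1,000 plays. This is the kind of reversal that raw cross-platform comparisons cannot exclude when audience size is unknown.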

Conclusion

This study presents the first assessment and comparison of the content quality, reliability, and dissemination characteristics of ICH-related videos on two major Chinese short-video platforms, TikTok and BiliBili. Our findings clearly demonstrate that BiliBili, leveraging its longer video format and greater tolerance for in-depth content, significantly outperforms TikTok in terms of informational comprehensiveness, overall quality, and reliability. However, TikTok exhibits overwhelming superiority in user engagement metrics, suggesting its short-form, highly immersive model is more conducive to widespread dissemination among the general population. Furthermore, the positive correlation identified between video duration and quality/reliability scores underscores the critical role of sufficient length in delivering substantive content. Conversely, the negative correlation between the number of comments and the GQS score serves as a cautionary note to viewers that high interaction volume should not be simplistically equated with high information quality. Notably, content created by SHPs was found to be of the highest quality, affirming the core value of professional authority in health education. However, a crucial finding is that even videos produced by SHPs failed to meet the standard for high-quality information. This highlights the persistent challenge of effectively translating specialized medical knowledge into accessible, reliable, and practical information for the public.

Footnotes

Acknowledgments

The authors extend their gratitude to all video uploaders included in this study. Their contributions to the dissemination of public health information form the foundation of the value of this research.

ORCID iD

Baisong Huang

Ethical considerations

This study did not involve human participants, clinical data, experimental animals, or histological research. All analyzed data were sourced from publicly available TikTok and BiliBili videos, and data collection strictly complied with the terms of service of TikTok and BiliBili. No private or personally identifiable information was collected or processed, nor was any interaction conducted with users. Therefore, the study qualifies for an ethics exemption and does not require ethics approval.

Author contributions

QWH was responsible for the conceptualization, methodology, and overall supervision of the study. BSH and AQC conducted the investigation, including data extraction and quality assessment. ZT, XY, and YYS performed data curation, collection, and formal analysis. The original draft of the manuscript was written by BSH and AQC, and all authors contributed to the subsequent review and editing of the final manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by grants from the National Natural Science Foundation of China (82071335, 82301508) and the Natural Science Foundation of Hubei Province (2023AFB280).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The datasets generated during this study are available from the corresponding author upon reasonable request.

Generative AI statement

The authors declare that no generative AI or AI-assisted tools were used for writing, data analysis, image generation, or code development in this study.

References

1. GBD 2023 Disease and Injury and Risk Factor Collaborators. Burden of 375 diseases and injuries, risk-attributable burden of 88 risk factors, and healthy life expectancy in 204 countries and territories, including 660 subnational locations, 1990-2023: a systematic analysis for the Global Burden of Disease Study 2023. Lancet 2025; 406: 1873–1922. https://doi.org/10.1016/S0140-6736(25)01637-X

2. Cordonnier, Demchuk, Ziai, et al. Intracerebral haemorrhage: current approaches to acute management. Lancet 2018; 392: 1257. https://doi.org/10.1016/S0140-6736(18)31878-6

3. GBD 2016 Stroke Collaborators. Global, regional, and national burden of stroke, 1990-2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet Neurol 2019; 18: 439–458.

4. Elhajjar, Ouaida. Use of social media in healthcare. Health Mark Q 2022; 39: 173–190. https://doi.org/10.1080/07359683.2021.2017389

5. Toler, Grubbs. Listening to TikTok - Patient Voices, Bias, and the Medical Record. N Engl J Med 2025; 392: 422–423. https://doi.org/10.1056/NEJMp2410601

6. Armstrong, Kim, Idriss, et al. Online video improves clinical outcomes in adults with atopic dermatitis: a randomized controlled trial. J Am Acad Dermatol 2011; 64: 502–507. https://doi.org/10.1016/j.jaad.2010.01.051

7. QuestMobile 2024 Annual Report on China’s Mobile Internet. https://www.questmobile.com.cn/research/report/1896846900944015361 (accessed 24 September 2025).

8. QuestMobile. Semi-annual report of CM Net in. 2024. Questmobile. https://www.questmobile.cn/research/reports (accessed 31 January 2026).

9. Feng, Malloch, Kravitz, et al. Assessing the effectiveness of a narrative-based patient education video for promoting opioid tapering. Patient Educ Couns 2021; 104: 329–336. https://doi.org/10.1016/j.pec.2020.08.019

10. O’Connor, Zhang, Honey, et al. Digital professionalism on social media: A narrative review of the medical, nursing, and allied health education literature. Int J Med Inform 2021; 153: 104514. https://doi.org/10.1016/j.ijmedinf.2021.104514

11. Song, Zhang. Interventions to support consumer evaluation of online health information credibility: A scoping review. Int J Med Inform 2021; 145: 104321. https://doi.org/10.1016/j.ijmedinf.2020.104321

12. Suarez-Lledo, Alvarez-Galvez. Prevalence of Health Misinformation on Social Media: Systematic Review. J Med Internet Res 2021; 23: e17187. https://doi.org/10.2196/17187

13. Zhang, et al. Evaluating the reliability and quality of knee osteoarthritis educational content on TikTok and Bilibili: A cross-sectional content analysis. Digit Health 2025; 11: 20552076251366390. https://doi.org/10.1177/20552076251366390

14. Rosenberg, Hollander, Gordon, et al. Evaluating the Quality of TikTok Videos on Vitiligo: A Cross-Sectional Study. Int J Dermatol 2025; 64: 2299. https://doi.org/10.1111/ijd.17962

15. Che, et al. The quality and reliability of short videos about hypertension on TikTok: a cross-sectional study. Sci Rep 2025; 15: 25042. https://doi.org/10.1038/s41598-025-08680-1

16. Lian, Pan, et al. Assessing the quality of breast cancer-related videos on TikTok: A cross-sectional study. Digit Health 2024; 10: 20552076241277688. https://doi.org/10.1177/20552076241277688

17. Charnock, Shepperd, Needham, et al. DISCERN: an instrument for judging the quality of written consumer health information on treatment choices. J Epidemiol Community Health 1999; 53: 105–111. https://doi.org/10.1136/jech.53.2.105

18. Silberg, Lundberg, Musacchio. Assessing, controlling, and assuring the quality of medical information on the Internet: Caveant lector et viewor--Let the reader and viewer beware. JAMA 1997; 277: 1244–1245.

19. Bernard, Langille, Hughes, et al. A systematic review of patient inflammatory bowel disease information resources on the World Wide Web. Am J Gastroenterol 2007; 102: 2070–2077. https://doi.org/10.1111/j.1572-0241.2007.01325.x

20. Mondal, Cassese, Candel MJJM, et al. Sample size determination for hypothesis testing on the intraclass correlation coefficient in a two-way analysis of variance model. Br J Math Stat Psychol 2025. https://doi.org/10.1111/bmsp.70016

21. Wang, Yao, Wang, et al. Bilibili, TikTok, and YouTube as sources of information on gastric cancer: assessment and analysis of the content and quality. BMC Public Health 2024; 24: 57. https://doi.org/10.1186/s12889-023-17323-x

22. Liu, et al. Adolescent Addiction to Short Video Applications in the Mobile Internet Era. Front Psychol 2022; 13: 893599. https://doi.org/10.3389/fpsyg.2022.893599

23. D’Ambrosi, Bellato, Bullitta, et al. TikTok and frozen shoulder: a cross-sectional study of social media content quality. J Orthop Traumatol 2024; 25: 57. https://doi.org/10.1186/s10195-024-00805-y

24. Xing, Zhang, et al. Evaluation of the information quality related to osteoporosis on TikTok. BMC Public Health 2024; 24: 2880. https://doi.org/10.1186/s12889-024-20375-2

25. Dai, Gong, et al. The Quality and Reliability of Online Videos as an Information Source of Public Health Education for Stroke Prevention in Mainland China: Electronic Media-Based Cross-Sectional Study. JMIR Infodemiology 2025; 5: e64891. https://doi.org/10.2196/64891

26. Pandian, Gall, Kate, et al. Prevention of stroke: a global perspective. Lancet 2018; 392: 1269–1278. https://doi.org/10.1016/S0140-6736(18)31269-8

27. Endres, Heuschmann, Laufs, et al. Primary prevention of stroke: blood pressure, lipids, and heart failure. Eur Heart J 2011; 32: 545–552. https://doi.org/10.1093/eurheartj/ehq472

28. Liu, et al. Quality assessment of health science-related short videos on TikTok: A scoping review. Int J Med Inform 2024; 186: 105426. https://doi.org/10.1016/j.ijmedinf.2024.105426

29. Cao, Zhang, Zhu, et al. Quality of cataract-related videos on TikTok and its influencing factors: A cross-sectional study. Digit Health 2025; 11: 20552076251365086. https://doi.org/10.1177/20552076251365086

30. Zhu, Wang, et al. Information quality of videos related to esophageal cancer on tiktok, kwai, and bilibili: a cross-sectional study. BMC Public Health 2025; 25: 2245. https://doi.org/10.1186/s12889-025-23475-9

31. Liang, Yang, et al. Quality and reliability of prostate cancer-Videos on TikTok and Bilibili: Cross-sectional content analysis study. Digit Health 2025; 11: 20552076251376263. https://doi.org/10.1177/20552076251376263

32. Niu, Hao, Yang, et al. Quality of Pancreatic Neuroendocrine Tumor Videos Available on TikTok and Bilibili: Content Analysis. JMIR Form Res 2024; 8: e60033. https://doi.org/10.2196/60033

33. Gupta, Mavi, Yadav. Atopic Dermatitis on TikTok: A Cross-Sectional Analysis of Content Quality and SoC Representation. J Cutan Med Surg 2025; 12034754251375051. https://doi.org/10.1177/12034754251375051

34. Zhang, et al. Quality of information in osteoporosis videos on TikTok: a cross-sectional study. Arch Osteoporos 2025; 20: 115. https://doi.org/10.1007/s11657-025-01597-2

35. Liang, Wang, Song, et al. Quality and Audience Engagement of Takotsubo Syndrome-Related Videos on TikTok: Content Analysis. J Med Internet Res 2022; 24: e39360. https://doi.org/10.2196/39360

36. Zhou, Zeng, Yuan, et al. Content accuracy and reliability of pulmonary nodule information on social media platforms: a cross-platform study of YouTube, Bilibili, and TikTok. Front Med (Lausanne) 2025; 12: 1613526. https://doi.org/10.3389/fmed.2025.1613526

37. Ren, et al. The quality and reliability of short videos about premature ovarian failure on Bilibili and TikTok: Cross-sectional study. Digit Health 2025; 11: 20552076251351077. https://doi.org/10.1177/20552076251351077

38. Wang, Zhang, Cao, et al. Quality and content evaluation of thyroid eye disease treatment information on TikTok and Bilibili. Sci Rep 2025; 15: 25134. https://doi.org/10.1038/s41598-025-11147-y

39. Lei, Liao, et al. Quality and reliability evaluation of pancreatic cancer-related video content on social short video platforms: a cross-sectional study. BMC Public Health 2025; 25: 1919. https://doi.org/10.1186/s12889-025-23130-3

40. Zheng, Tong, Wan, et al. Quality and Reliability of Liver Cancer-Related Short Chinese Videos on TikTok and Bilibili: Cross-Sectional Content Analysis Study. J Med Internet Res 2023; 25: e47210. https://doi.org/10.2196/47210

41. Clement, Wydra, Gobinathan, et al. Alcohol-Related Content Delivered Through TikTok’s Search Function: A Content Analysis of Top Videos Across Popular Alcohol Terms. J Stud Alcohol Drugs 2025; 86: 862–872. https://doi.org/10.15288/jsad.24-00308

42. Romano, Abarca, Baehman. A Low-Cost, Social Media-Supported Intervention for Caregivers to Enhance Toddlers’ Language Learning: Mixed Methods Feasibility and Acceptability Study. JMIR Pediatr Parent 2025; 8: e66175. https://doi.org/10.2196/66175

Quality and reliability of video based information on intracerebral hemorrhage on TikTok and Bilibili: A cross sectional study
