
Abstract

Objective

With rising kidney stone prevalence in China (currently 7.54%), patients increasingly seek health information through short video platforms like TikTok and Bilibili. However, the quality and reliability of kidney stone-related content on these platforms remains unclear, potentially affecting patient understanding and health decisions.

Methods

In this cross-sectional study, we analyzed 172 kidney stone videos from TikTok (n=95) and Bilibili (n=77). Videos were categorized by uploader type: professional individuals (67.44%), nonprofessional individuals (22.09%), professional institutions (6.40%), and nonprofessional institutions (4.07%). Quality assessment utilized the Global Quality Score (GQS) and modified DISCERN tool, evaluating content comprehensiveness across six domains: definition, symptoms, risk factors, evaluation, management, and outcomes. Statistical analyses compared platform differences and uploader type variations.

Results

TikTok demonstrated significantly higher engagement metrics (median views: 140,255 vs. 10,489; comments: 207 vs. 37; likes: 703 vs. 78) but shorter video duration (53s vs. 121s, p<0.001). Bilibili videos achieved higher GQS scores (median: 3.0 vs. 2.0, p=0.002). The median modified DISCERN score was 2.0 on both platforms, but the score distributions differed significantly in favor of Bilibili (p=0.013). Professional institutions produced the highest-quality content (median GQS: 3.0, IQR: 3.0-4.0; median DISCERN: 2.0, IQR: 2.0-3.0), significantly outperforming nonprofessional creators (p<0.001). Content analysis revealed inadequate coverage of comprehensive kidney stone education, with most videos focusing on basic symptoms and management rather than prevention and risk factors.

Conclusion

While professional creators maintain higher content quality, overall kidney stone information quality on short video platforms remains suboptimal. Platform-specific differences suggest Bilibili’s longer format enables more comprehensive education despite lower engagement. Enhanced content standards and professional creator incentivization are needed to improve kidney stone health education on social media platforms.

Keywords

kidney stones, social media, information quality, reliability, TikTok, Bilibili

1. Introduction

Kidney stone disease represents a significant global public health challenge, with prevalence rates increasing substantially over the past three decades.¹ In China, systematic reviews demonstrate a dramatic rise from approximately 4% in the 1990s to 7.54% currently, with marked regional variations ranging from 1.36% to 13.69% across different provinces.^2,3 This condition not only causes severe acute pain and potential complications but also imposes considerable economic burden on healthcare systems, with recurrence rates reaching 50% within 10 years without appropriate prevention strategies.⁴

The digital transformation of health information seeking has fundamentally altered patient behavior in accessing medical knowledge. Short-video platforms like TikTok and Bilibili have emerged as primary sources for health-related content, attracting billions of users globally due to their accessibility, visual appeal, and simplified information presentation.^5–7 Contemporary surveys indicate that over 84% of patients seek online health information before or after medical consultations, with video content increasingly preferred over traditional text-based resources.

However, the quality and reliability of health information on social media platforms remain concerning. Unlike peer-reviewed medical literature, social media content lacks systematic quality control mechanisms, potentially exposing users to inaccurate or misleading information.⁸ Recent studies examining various medical conditions have documented significant quality variations across platforms and content creators. While some studies report varying levels of factual inaccuracies, a more prevalent concern is the high rate of suboptimal information quality and the absence of comprehensive clinical guidance on short-form video platforms.^9,10 These quality disparities are particularly problematic for kidney stone management, where accurate information regarding prevention, dietary modifications, and treatment options is crucial for optimal patient outcomes.

Research on health information quality assessment has established validated tools including the DISCERN instrument and Global Quality Score (GQS) for evaluating medical content reliability and educational value.^11,12 Contemporary studies on health-related social media content have revealed consistent patterns: professional healthcare providers typically produce higher-quality content compared to non-medical creators and represent a substantial proportion of health content creators on major platforms.^13–15 Platform-specific differences also emerge, with longer-form content generally achieving superior quality scores due to comprehensive information coverage capabilities.¹⁶

Despite growing reliance on social media for kidney stone information, systematic evaluation of content quality on Chinese platforms remains limited. Existing research has primarily focused on Western platforms and other urological conditions, leaving a significant knowledge gap regarding kidney stone information quality on TikTok and Bilibili.^17,18 Understanding these quality patterns is essential for developing targeted interventions to improve patient education and clinical outcomes.

This study aims to comprehensively evaluate the quality and reliability of kidney stone-related information on TikTok and Bilibili using validated assessment tools. Specifically, we sought to: (1) assess content quality using the Global Quality Score and modified DISCERN criteria; (2) compare quality differences between platforms and content creator types; (3) analyze content comprehensiveness across essential kidney stone education domains; and (4) examine the relationship between video characteristics and quality metrics. These findings will inform evidence-based strategies for improving kidney stone health education on social media platforms.

2. Materials and methods

2.1. Study design and search strategy

This cross-sectional study evaluated the quality and reliability of kidney stone-related videos on TikTok and Bilibili platforms. Data collection was performed on August 29, 2025 using systematic search strategies to ensure comprehensive sampling and minimize selection bias.

Video retrieval was conducted using the Chinese search term “kidney stones” on both platforms. To eliminate potential bias from personalized recommendation algorithms, all searches were performed using newly created accounts with cleared browsing histories and cache data. The search was conducted in incognito mode to prevent algorithmic influence from previous search patterns. Videos were ranked according to each platform’s default sorting criteria, and the top 100 videos from each platform were systematically screened for inclusion.

In line with prior research indicating that videos ranked below the top 100 have a negligible effect on results, our analysis was confined to this initial set.^19–21 All included videos were in Chinese or in English with accurate Chinese subtitles. We applied the following exclusion criteria: duplicate videos, videos exclusively in English without Chinese subtitles, videos without audio, and videos not directly related to kidney stones (Figure 1). For each video, we recorded its title, uploader details (name and identity), duration, content, engagement metrics (likes, comments, shares, saves), and days since publication. All data were logged in a Microsoft Excel spreadsheet.
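The exclusion criteria above can be sketched as a simple screening pass. This is a minimal illustration, not the authors' workflow (screening was recorded in Excel): the `Video` record, its field names, and the `screen` helper are all hypothetical, and the order in which criteria are checked is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    # All fields are hypothetical; the paper does not describe a data schema.
    video_id: str
    language: str               # "zh" or "en"
    has_chinese_subtitles: bool
    has_audio: bool
    about_kidney_stones: bool


def screen(videos):
    """Return (included, excluded), where excluded holds (video_id, reason) pairs."""
    seen_ids = set()
    included, excluded = [], []
    for v in videos:
        if v.video_id in seen_ids:
            excluded.append((v.video_id, "duplicate"))
        elif v.language == "en" and not v.has_chinese_subtitles:
            excluded.append((v.video_id, "English without Chinese subtitles"))
        elif not v.has_audio:
            excluded.append((v.video_id, "no audio"))
        elif not v.about_kidney_stones:
            excluded.append((v.video_id, "not about kidney stones"))
        else:
            included.append(v)
        seen_ids.add(v.video_id)
    return included, excluded
```

Recording a reason for every excluded video makes it straightforward to populate a flow chart of the selection process.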

Figure 1.

The flow chart of this study.

2.2. Video categorization and data extraction

Following established methodology from recent health information quality studies, videos were categorized based on uploader identity and institutional affiliation.^22,23 The videos were divided into four groups according to uploader type, defined as follows.

1. Professional individuals: Persons with formal medical or healthcare qualifications (e.g., licensed physicians, urologists, nephrologists, nurses) or relevant academic backgrounds (e.g., medical researchers specializing in urology, nephrology, or kidney stone disease, and registered dietitians with expertise in renal nutrition).

2. Nonprofessional individuals: Persons without formal medical/healthcare training (e.g., kidney stone patients sharing personal experiences, wellness enthusiasts without clinical credentials, and individuals providing dietary advice without professional qualifications).

3. Professional institutions: Organizations with explicit medical or healthcare mandates (e.g., hospitals, urology departments, nephrology centers, academic medical institutions, national health agencies like the Chinese Center for Disease Control and Prevention, and professional medical associations such as urological societies).

4. Nonprofessional institutions: Organizations without a primary focus on healthcare (e.g., general media outlets, wellness brands, dietary supplement companies, community groups unrelated to medical practice, and commercial entities promoting kidney stone-related products).

For each included video, comprehensive metadata were systematically extracted: platform source, upload date, video duration (seconds), engagement metrics (views, likes, comments, collections, shares), uploader follower count, and content creator credentials. Two independent reviewers performed data extraction using standardized forms to ensure consistency and minimize extraction bias. The extracted data were recorded in Excel (Microsoft Inc).

2.3. Content completeness assessment

Video content was evaluated across six essential kidney stone education domains based on current clinical guidelines and recent health information quality research.^22–25 These domains included: (1) Definition - explanation of kidney stone formation and types; (2) Symptoms - clinical presentation and pain characteristics; (3) Risk factors - dietary, genetic, and environmental contributors; (4) Evaluation - diagnostic procedures and imaging studies; (5) Management - treatment options including medical and surgical interventions; and (6) Outcomes - prognosis, complications, and prevention strategies.

Each domain was scored using a standardized rubric: 0 points (no content provided), 0.5 points (minimal content with limited detail), 1 point (some content with moderate detail), 1.5 points (most content with good detail), and 2 points (extensive content with comprehensive coverage). This scoring system allowed for nuanced assessment of content depth and educational value across different topic areas.
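The rubric above can be expressed as a small validation-and-summary routine. This is a sketch under stated assumptions: the domain keys and the `completeness_profile` helper are hypothetical names, and the summed total (maximum 12) is shown only for illustration, since the study reports coverage by domain rather than a total.

```python
# Six domains and five permitted score levels, as described in the rubric.
DOMAINS = ("definition", "symptoms", "risk_factors",
           "evaluation", "management", "outcomes")
ALLOWED_SCORES = (0.0, 0.5, 1.0, 1.5, 2.0)


def completeness_profile(scores):
    """Validate per-domain scores; return (total, domains scored 0)."""
    missing = [d for d in DOMAINS if d not in scores]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    for d in DOMAINS:
        if scores[d] not in ALLOWED_SCORES:
            raise ValueError(f"invalid score for {d}: {scores[d]}")
    total = sum(scores[d] for d in DOMAINS)          # illustrative summary only
    uncovered = [d for d in DOMAINS if scores[d] == 0]
    return total, uncovered


# Hypothetical video: strong on symptoms and management, silent on two domains.
example = {"definition": 1.0, "symptoms": 2.0, "risk_factors": 0.0,
           "evaluation": 0.0, "management": 1.5, "outcomes": 0.5}
total, uncovered = completeness_profile(example)
```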

2.4. Quality and reliability assessment

Two validated instruments were employed to assess video quality and reliability. The Global Quality Score (GQS) utilized a 5-point Likert scale (1 = poor quality to 5 = excellent quality) evaluating overall educational value, flow of information, and usefulness for patient education. This tool has demonstrated validity across multiple health topics and video platforms in recent studies.^15,26,27

The modified DISCERN tool assessed video reliability through five criteria: (1) clarity of information presentation; (2) relevance to kidney stone education; (3) traceability of information sources; (4) robustness of evidence presented; and (5) fairness and balance of content. Each criterion was scored as 0 (criterion not met) or 1 (criterion met), yielding total scores ranging from 0 to 5 points.^22,28
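As a minimal sketch of the scoring rule just described, the modified DISCERN total is the count of criteria met. The criterion keys and the `modified_discern` function name are hypothetical labels chosen for this illustration.

```python
# Five binary criteria, each worth 0 (not met) or 1 (met).
CRITERIA = ("clarity", "relevance", "traceability", "robustness", "fairness")


def modified_discern(met):
    """Return the 0-5 total for a mapping of criterion name -> bool."""
    unknown = set(met) - set(CRITERIA)
    absent = set(CRITERIA) - set(met)
    if unknown or absent:
        raise ValueError(f"unknown={sorted(unknown)}, absent={sorted(absent)}")
    return sum(1 for c in CRITERIA if met[c])


# Hypothetical video: clear, relevant, and balanced, but cites no sources.
score = modified_discern({"clarity": True, "relevance": True,
                          "traceability": False, "robustness": False,
                          "fairness": True})
```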

2.5. Inter-rater reliability

Two investigators (reviewer A: JZ; reviewer B: JG) with medical backgrounds independently evaluated all videos after completing standardized training sessions to ensure assessment consistency. Training included review of scoring criteria, practice evaluations using pilot videos, and discussion of discrepancies until consensus was achieved. Inter-rater reliability was assessed using Cohen’s kappa coefficient, with values ≥0.8 considered excellent agreement.²⁹ Any discrepancies in final scoring were resolved through discussion and consensus between evaluators.
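For reference, Cohen's kappa compares observed agreement with the agreement expected by chance. The sketch below computes the unweighted coefficient; whether the study used an unweighted or weighted variant for the ordinal GQS ratings is not stated, so this choice is an assumption, and the ratings shown are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical GQS ratings for ten videos; the raters disagree on one video.
reviewer_a = [3, 2, 4, 2, 3, 1, 2, 3, 2, 5]
reviewer_b = [3, 2, 4, 2, 2, 1, 2, 3, 2, 5]
kappa = cohens_kappa(reviewer_a, reviewer_b)
```

With these invented ratings, observed agreement is 0.90 and chance agreement is 0.29, giving a kappa of about 0.86, which would fall above the 0.8 threshold described above.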

2.6. Statistical analysis

Statistical analyses were performed using IBM SPSS Statistics 27.0 software. Descriptive statistics were calculated for all variables, with categorical data presented as frequencies and percentages, and continuous variables described using medians and interquartile ranges (IQR) due to non-normal distributions. Platform comparisons were conducted using Mann-Whitney U tests for continuous variables and chi-square tests for categorical variables. Kruskal-Wallis tests were employed for multi-group comparisons among uploader categories. Spearman correlation analysis examined relationships between video characteristics (duration, engagement metrics) and quality scores. Statistical significance was set at p < 0.05 for all analyses. Given the exploratory nature of this study, no formal correction for multiple comparisons (e.g., Bonferroni correction) was applied. The large number of comparisons conducted across quality instruments, uploader types, content dimensions, and correlation analyses increases the risk of Type I errors (false-positive findings). Accordingly, all reported p-values should be interpreted in the context of this multiple testing framework, and statistically significant results—particularly those from subgroup analyses—should be considered hypothesis-generating rather than confirmatory.
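To make the platform comparison concrete, the following sketch computes a Mann-Whitney U statistic with a tie-corrected normal approximation. It is not the SPSS procedure used in the study: it omits any continuity correction, the sign of Z depends on which group is listed first, and the duration values are invented for illustration.

```python
import math
from collections import Counter


def midranks(values):
    """Ascending 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U for x, Z, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Variance of U with the standard correction for tied observations.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        raise ValueError("all observations are tied; the test is undefined")
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, z, p


# Hypothetical video durations in seconds for two small groups.
short_form = [25, 40, 53, 60, 83]
long_form = [68, 90, 121, 200, 309]
u, z, p = mann_whitney(short_form, long_form)
```

For samples this small an exact test would normally be preferred; the normal approximation is shown because it corresponds to the Z values reported for samples of the size analyzed here.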

3. Results

3.1. Video selection and characteristics

Following systematic screening of the top 100 videos from each platform, 172 videos met inclusion criteria for final analysis: 95 (55.23%) from TikTok and 77 (44.77%) from Bilibili. The video selection process excluded 28 videos due to duplication, commercial content, or insufficient educational value.

Professional individuals dominated content creation, accounting for 116 videos (67.44%), followed by nonprofessional individuals with 38 videos (22.09%), professional institutions with 11 videos (6.40%), and nonprofessional institutions with 7 videos (4.07%). This distribution demonstrates substantial involvement of healthcare professionals in kidney stone education on short video platforms, significantly exceeding typical professional representation rates reported in previous social media health studies (Figure 2).

Figure 2.

General information on kidney stone videos from TikTok and Bilibili. (a) Pie chart showing the percentage of uploader types across both platforms. (b) Percentage stacked bar chart showing the percentage of uploader types on each platform (TikTok and Bilibili).

3.2. Platform-specific differences

Significant differences emerged between TikTok and Bilibili across multiple video characteristics and engagement metrics, as detailed in Table 1. TikTok demonstrated markedly superior user engagement, with median views of 140,255 (IQR: 14,617-1,168,530) compared to Bilibili’s 10,489 (IQR: 4,008-46,353.5, p<0.001). Similar patterns were observed for comments (207 vs. 37, p<0.001), likes (703 vs. 78, p<0.001), collections (226 vs. 62, p=0.013), and shares (294 vs. 31, p<0.001).

Table 1.

Comparison of characteristics between different short-video platforms.

Variables	TikTok (n=95)	Bilibili (n=77)	χ²/Z	p-value^a
Uploader type [n (%)]			9.472	0.023 ^b
Professional individuals	72 (75.8%)	44 (57.1%)
Professional institutions	4 (4.2%)	7 (9.1%)
Nonprofessional individuals	18 (18.9%)	20 (26.0%)
Nonprofessional institutions	1 (1.1%)	6 (7.8%)
Days since published, median (IQR)	66 (8-247)	631 (173.5-1049)	-6.139	<0.001
Views, median (IQR)	140255 (14617-1168530)	10489 (4008-46353.5)	-5.926	<0.001
Comments, median (IQR)	207 (32-1186)	37 (4.5-269)	-4.412	<0.001
Likes, median (IQR)	703 (92-5591)	78 (27.5-482)	-4.810	<0.001
Collections, median (IQR)	226 (20-1463)	62 (15.5-258.5)	-2.476	0.013
Shared, median (IQR)	294 (16-3629)	31 (6-228)	-3.730	<0.001
Duration, median (IQR)	53 (25-83)	121 (68-309)	-6.636	<0.001
Followers, median (IQR)	12387 (1597-55390)	1739 (269-14120.5)	-3.717	<0.001
GQS score, median (IQR)	2 (2-3)	3 (2-3)	-3.026	0.002
DISCERN score, median (IQR)	2 (1-2)	2 (1-2)	-2.485	0.013

GQS: Global Quality Score; IQR: interquartile range.

^aMann-Whitney U tests.

^bFisher’s exact test.

Creator demographics differed significantly between platforms (p=0.023). TikTok exhibited higher professional individual representation (75.8% vs. 57.1%), while Bilibili showed greater nonprofessional individual participation (26.0% vs. 18.9%) and nonprofessional institutional presence (7.8% vs. 1.1%). This distribution suggests platform-specific preferences among different content creator categories (Figure 2). Video duration also differed markedly, with a median of 53 seconds (IQR: 25-83) on TikTok compared to 121 seconds (IQR: 68-309) on Bilibili (p<0.001). This difference likely reflects platform design constraints and user engagement patterns. Creator influence also varied substantially, with TikTok uploaders demonstrating significantly higher follower counts (median: 12,387; IQR: 1,597-55,390) compared to Bilibili creators (median: 1,739; IQR: 269-14,120.5, p<0.001).
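As a worked check on the uploader-type comparison, the sketch below computes a Pearson chi-square statistic from the counts in Table 1. The helper name `pearson_chi_square` is hypothetical, and this is only the chi-square statistic, not the Fisher's exact test cited in the table footnote.

```python
def pearson_chi_square(table):
    """Return (chi-square, degrees of freedom, number of expected counts < 5)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    small_cells = 0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            if expected < 5:
                small_cells += 1
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof, small_cells


# Rows: uploader types; columns: TikTok (n=95), Bilibili (n=77), from Table 1.
uploader_counts = [
    [72, 44],   # professional individuals
    [4, 7],     # professional institutions
    [18, 20],   # nonprofessional individuals
    [1, 6],     # nonprofessional institutions
]
chi2, dof, small_cells = pearson_chi_square(uploader_counts)
```

The statistic comes out at approximately 9.47 on 3 degrees of freedom, close to the 9.472 reported in Table 1. Three of the eight expected counts fall below 5, which is the usual situation in which an exact test is preferred over the chi-square approximation.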

3.3. Quality assessment outcomes

Quality assessment revealed significant platform-specific performance differences using validated instruments. Bilibili videos achieved higher Global Quality Scores, with a median of 3 (IQR: 2-3) compared to TikTok’s median of 2 (IQR: 2-3, p=0.002). This difference may reflect Bilibili’s extended format enabling more comprehensive information coverage. Modified DISCERN scores had the same median and interquartile range on both platforms (2; IQR: 1-2), but the rank-based comparison indicated that the score distributions differed in favor of Bilibili (p=0.013); this difference was more modest than that observed for GQS. Both platforms demonstrated suboptimal reliability scores, indicating widespread challenges in achieving evidence-based health information standards on short video platforms (Figure 3(a) and (b)).

Figure 3.

Comparison of the quality and reliability of kidney stone videos. (a) and (b): TikTok and Bilibili. (c) and (d): Different uploader types. *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001.

3.4. Content creator analysis

Professional institutions (n=11) tended to produce the highest-quality content across both assessment instruments, as shown in Table 2, with a median GQS score of 3 (IQR: 3-4) and a median DISCERN score of 2 (IQR: 2-3). Although the overall group comparison was statistically significant (p<0.001 for both GQS and DISCERN), the small sample size of the professional institution subgroup warrants cautious interpretation of these differences. Professional individuals demonstrated intermediate performance with median GQS scores of 3 (IQR: 2-3) and DISCERN scores of 2 (IQR: 2-2).

Table 2.

Comparison of different uploader types.

Variables	Professional individuals (n=116)	Professional institutions (n=11)	Nonprofessional individuals (n=38)	Nonprofessional institutions (n=7)	H	p-value^a
Days since published, median (IQR)	117 (14.25-593.75)	1047 (202-1642)	221.5 (32-570.5)	999 (65-1541)	14.888	0.002
Views, median (IQR)	50397 (6934.5-298023.5)	12612 (3853-51995)	37925 (8450.25-147014)	99484 (12134-1077781)	4.480	0.214
Comments, median (IQR)	105.5 (10.25-564.75)	12 (3-34)	256 (65.5-868.25)	573 (60-1516)	12.872	0.005
Likes, median (IQR)	275.5 (59-2051.25)	75 (19-499)	225.5 (66.5-1101)	1064 (191-8453)	4.762	0.190
Collections, median (IQR)	122.5 (15.25-883)	145 (18-967)	84.5 (20.25-410)	193 (96-1463)	1.910	0.591
Shared, median (IQR)	65 (6.25-1001.75)	64 (25-410)	38.5 (13.5-355.75)	374 (59-6115)	3.245	0.355
Duration, median (IQR)	65 (44.25-107.5)	160 (83-892)	113.5 (33.25-334.75)	107 (72-313)	12.859	0.005
Followers, median (IQR)	12086 (659-61890.25)	592 (494-4934)	1901.5 (126.5-6758.25)	52624 (5674-1172766)	20.392	<0.001
GQS score, median (IQR)	3 (2-3)	3 (3-4)	2 (1-2)	3 (2-3)	53.429	<0.001
DISCERN score, median (IQR)	2 (2-2)	2 (2-3)	1 (0-1)	2 (1-2)	50.723	<0.001

GQS: Global Quality Score; IQR: interquartile range.

^aKruskal–Wallis rank sum test.

Nonprofessional content creators showed markedly lower quality metrics. Nonprofessional individuals achieved median GQS scores of 2 (IQR: 1–2) and DISCERN scores of 1 (IQR: 0–1). Nonprofessional institutions (n=7) showed median GQS scores of 3 (IQR: 2–3) and DISCERN scores of 2 (IQR: 1–2); however, given the very small number of videos in this category, these results should be interpreted with considerable caution and may not be representative of nonprofessional institutional content more broadly. These patterns are broadly consistent with prior research suggesting quality advantages for professionally-created health content, though direct comparisons are limited by subgroup size (Figure 3(c) and (d)).

Video duration differed significantly by creator type (p=0.005). Professional institutions produced the longest content (median: 160 seconds; IQR: 83-892), followed by nonprofessional individuals (113.5 seconds; IQR: 33.25-334.75) and nonprofessional institutions (107 seconds; IQR: 72-313), while professional individuals created the most concise videos (65 seconds; IQR: 44.25-107.5). This pattern suggests different educational approaches among creator categories.

3.5. Content comprehensiveness analysis

Content analysis revealed substantial gaps in comprehensive kidney stone education across all video categories, as shown in Supplemental Table 1 and Supplemental Table 2. Most videos provided inadequate coverage of essential educational domains, with over 60% offering no content or minimal information in critical areas. Definition coverage proved most comprehensive, with 44.2% of videos providing some, most, or extensive content, though 62.2% still offered no definitional information. Symptom coverage achieved similar levels (45.9% with adequate content), while management information was addressed in 44.1% of videos. However, critical educational domains received minimal attention: risk factors were adequately covered in only 18.6% of videos, evaluation procedures in 15.1%, and outcomes in 16.3%.

Platform-specific content patterns emerged from detailed analysis (Figure 4). TikTok videos more frequently omitted basic definitional information (68.4% vs. 54.5% for Bilibili) and risk factor discussions (83.2% vs. 79.2%). Bilibili demonstrated superior coverage of advanced topics, with higher proportions addressing comprehensive outcomes (25.9% vs. 13.7% for TikTok) and evaluation procedures (19.5% vs. 11.6%).

Figure 4.

Comparison of content comprehensiveness between different short-video platforms.

Professional institutions provided the most comprehensive content across all domains, as demonstrated in Supplemental Table 3. These creators achieved extensive coverage rates of 27.2% for definitions, 27.2% for symptoms, 36.3% for risk factors, and 27.2% for outcomes. In contrast, nonprofessional individuals rarely provided extensive coverage in any domain, with maximum rates of 7.8% for definitions and no extensive coverage for several critical areas (Figure 5).

Figure 5.

Comparison of content comprehensiveness between different uploader types.

3.6. Correlation analysis

Spearman correlation analysis revealed platform-specific relationships between video characteristics and quality metrics (Figure 6). On TikTok, strong positive correlations emerged between engagement metrics, with views, comments, likes, collections, and shares showing correlation coefficients ranging from 0.85 to 0.96 (p<0.001). However, engagement metrics showed only weak-to-moderate correlations with quality scores (GQS: r=0.26-0.39; DISCERN: r=0.25-0.47), suggesting that popular content does not necessarily equate to educational quality. Bilibili demonstrated different correlation patterns, with moderate-to-strong engagement metric intercorrelations (r=0.63-0.94) but similarly weak relationships between engagement and quality measures. Notably, video duration on Bilibili showed negligible correlations with quality scores (GQS: r=0.05; DISCERN: r=0.09), so the within-platform data offer little direct support for the hypothesis that longer formats enable more comprehensive education. Days since publication also differed between platforms: TikTok videos demonstrated positive correlations between publication age and engagement (r=0.60-0.73), while Bilibili showed weaker associations (r=0.26-0.43). This pattern suggests different content discovery and sharing mechanisms between platforms.
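For readers unfamiliar with the method, Spearman's coefficient is the Pearson correlation of the ranked data, with tied values given their mean rank. The sketch below illustrates that computation on invented view counts and GQS ratings; it is not the SPSS routine used in the study, and the function names are hypothetical.

```python
import math


def midranks(values):
    """Ascending 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the midranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = midranks(list(x)), midranks(list(y))
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical data: the most-viewed video does not have the highest GQS.
views = [100, 2500, 40, 90000, 700]
gqs = [2, 3, 2, 2, 4]
rho = spearman(views, gqs)
```

In this invented example the coefficient is about 0.22, illustrating how a large spread in views can coexist with a weak association with quality ratings.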

Figure 6.

Correlation matrix of video engagement metrics and quality scores on Bilibili and TikTok.

3.7. Quality distribution patterns

Quality score distributions revealed concerning patterns across both platforms. The majority of videos achieved suboptimal quality ratings, with 68.6% scoring ≤2 on the GQS scale and 71.5% scoring ≤2 on the modified DISCERN assessment. Only 8.7% of videos achieved GQS scores of 4 or higher, indicating excellent educational quality, while merely 3.5% received DISCERN scores of 4-5, suggesting high reliability standards.³⁰

Professional content consistently outperformed nonprofessional material, but even professional videos showed substantial quality variation. Among professional individuals, 34.5% achieved GQS scores of 4-5, compared to 7.9% of nonprofessional individuals. Similarly, 15.5% of professional videos met high reliability standards (DISCERN ≥4) versus 2.6% of nonprofessional content.

These findings underscore the pervasive challenge of maintaining educational quality standards on social media platforms, even among healthcare professionals, and highlight the urgent need for improved content creation guidelines and quality assurance mechanisms for health-related video content.

4. Discussion

To our knowledge, this is one of the first comprehensive evaluations of kidney stone health information quality on Chinese short video platforms, revealing significant quality disparities between platforms, content creators, and educational domains. Our findings demonstrate that while professional healthcare providers consistently produce higher-quality content, overall information quality remains suboptimal across both TikTok and Bilibili, with important implications for patient education and clinical outcomes.

4.1. Platform-specific quality patterns

The superior quality scores achieved by Bilibili videos (median GQS: 3.0 vs. 2.0; DISCERN: 2.0 vs. 2.0) align with recent comparative platform research, where longer-form content consistently enables more comprehensive health education.^16,22 Our findings parallel those of recent studies demonstrating similar quality advantages for Bilibili over TikTok in various health topics, suggesting that platform architecture fundamentally influences educational potential.³⁰ The extended duration capacity of Bilibili (median: 121 seconds vs. 53 seconds) allows content creators to address multiple educational domains comprehensively, whereas TikTok’s format constraints limit depth of coverage.

Observable differences in platform format characteristics were associated with distinct patterns in content quality and scope. TikTok videos were substantially shorter (median 53 seconds vs. 121 seconds for Bilibili), a structural constraint that may limit creators’ ability to address multiple educational domains comprehensively. This format difference is consistent with the more frequent omission of basic definitional information on TikTok (68.4% vs. 54.5% for Bilibili) and of risk factor discussions (83.2% vs. 79.2%). Conversely, Bilibili’s longer-form format appears conducive to broader educational coverage, as reflected in higher proportions of videos addressing outcomes (25.9% vs. 13.7% for TikTok) and evaluation procedures (19.5% vs. 11.6%). Whether these patterns are attributable to platform design, creator self-selection, audience expectations, or content ranking mechanisms cannot be determined from the present observational data.

However, TikTok’s engagement superiority presents a paradoxical challenge for kidney stone health communication. Despite lower educational quality, TikTok videos achieved dramatically higher audience reach (median views: 140,255 vs. 10,489), suggesting broader public health impact potential but potentially exposing millions of Chinese patients to incomplete information. This engagement-quality discrepancy mirrors findings from recent coronavirus misinformation studies, where entertaining but inaccurate content has been observed to achieve wider reach than evidence-based educational material on video-sharing platforms.⁹ The weak associations between engagement metrics and quality scores (r=0.26–0.47) in our data are consistent with findings that content popularity does not reliably reflect informational quality. Prior research has described this pattern as ‘misinformation amplification,’ in which lower-quality health content achieves disproportionate visibility on short-video platforms; however, whether such amplification results from algorithmic mechanisms, audience preferences, or creator strategies was not examined in our study. This pattern is consistent across multiple health topics and platforms reported in the existing literature, suggesting our findings may reflect broader challenges in health information dissemination on short-video platforms rather than kidney stone-specific phenomena.^31–33

4.2. Professional creator impact

Professional institutions achieved the highest quality scores (median GQS: 3.0, IQR: 3.0-4.0; median DISCERN: 2.0, IQR: 2.0-3.0), consistent with recent lymphedema research where institutional content demonstrated superior reliability across multiple assessment domains.²² This quality advantage likely stems from institutional review processes, evidence-based content development protocols, and accountability mechanisms absent in individual creator workflows. Professional individuals, while outperforming nonprofessional creators, showed notable quality variation (GQS range: 1-5), suggesting that medical training alone does not guarantee high-quality social media content creation. This finding aligns with recent analyses of physician-created content, where clinical expertise did not consistently translate to effective patient education materials.^22,34 The challenge may reflect limited training in health communication principles, social media content optimization, or evidence-based patient education methodologies among healthcare providers.

Professional institutions (n=11) demonstrated relatively higher quality scores compared to individual creators, a finding tentatively consistent with the hypothesis that institutional review processes, evidence-based content development protocols, and accountability mechanisms may support higher-quality output. However, given the small number of professional institution videos in this sample, this comparison should be interpreted as exploratory rather than confirmatory. If replicated in larger samples, this pattern would suggest that healthcare organizations could play an important role in improving social media health information quality by supporting systematic content development approaches.

4.3. Content comprehensiveness deficiencies

The widespread inadequacy in comprehensive kidney stone education coverage represents a significant public health concern with potential implications for patient understanding and health decision-making. With 62.2% of videos providing no definitional information and 81.4% failing to address risk factors adequately, patients may develop incomplete or inaccurate understanding of their condition. This educational gap is particularly concerning given kidney stone recurrence rates of 50% within 10 years, where prevention knowledge proves crucial for clinical outcomes.⁴ Chinese surveys indicate that 70% of kidney stone patients express strong desire to understand prevention methods,² yet our findings reveal that only 18.6% of videos adequately covered risk factors, creating a stark disconnect between patient educational needs and available social media content.

The predominant focus on symptom management (45.9% adequate coverage) versus prevention strategies (18.6% adequate coverage) reflects a treatment-oriented rather than prevention-focused approach inconsistent with current clinical guidelines. Recent American Urological Association guidelines emphasize prevention as the cornerstone of kidney stone management, recommending comprehensive patient education covering dietary modifications, fluid management, and risk factor identification.²³ Our findings suggest social media content fails to reflect these evidence-based priorities, which may limit patients’ awareness of prevention strategies and their capacity for informed self-management. The gap between symptom-focused content and prevention-oriented guidelines is noteworthy: patients who primarily encounter symptom and treatment information without guidance on dietary modifications and risk factor reduction may have fewer opportunities to engage in proactive behaviors. Whether this content gap translates into altered patient behavior or clinical outcomes was not assessed in this study and warrants direct investigation through prospective research designs.

Comparison with recent kidney stone social media research reveals consistent patterns of educational inadequacy. A 2024 study analyzing YouTube and TikTok content found similar deficiencies in prevention coverage (12% vs. our 18.6%) and risk factor discussion (15% vs. our 18.6%), suggesting systemic challenges in comprehensive kidney stone education across multiple platforms and geographic regions.³⁵ This consistency indicates that content creation challenges transcend individual platform characteristics or cultural contexts, representing a global phenomenon requiring targeted interventions.

4.4. Clinical implications and misinformation risks

While our study did not systematically quantify misinformation prevalence, the low reliability scores (71.5% achieving DISCERN ≤2) suggest substantial potential for inaccurate information dissemination with implications for patient understanding. The weak correlation between engagement and quality scores (r=0.26–0.47) is consistent with broader concerns that content popularity on short-video platforms may not reliably reflect informational quality. Although clinical outcomes were not measured in this study, prior behavioral research has reported associations between exposure to low-quality health information and outcomes such as increased health anxiety, delayed healthcare seeking, and reduced treatment adherence.²⁵ The high engagement rates observed for lower-quality content in this study highlight the potential for large numbers of patients to encounter and be influenced by unreliable kidney stone information; however, whether such exposure affects actual health-seeking behavior or clinical outcomes requires direct examination in future patient-centered research.
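As a hedged illustration of how an engagement-quality association such as the reported r=0.26–0.47 could be computed, the sketch below implements a rank-based (Spearman) coefficient in plain Python. The choice of Spearman is an assumption, since this section does not state which coefficient was used, and the `likes` and `gqs` values are invented for illustration only; they are not study data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-video values (NOT study data).
likes = [78, 703, 120, 5000, 40, 950, 207, 37]
gqs = [3, 2, 3, 2, 4, 3, 2, 4]
rho = spearman(likes, gqs)
```

A rank-based coefficient is a natural fit for heavily skewed engagement counts and ordinal quality scores, but it describes association only and says nothing about how recommendation systems rank content.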

The predominance of symptom-focused content over prevention education may contribute to reactive rather than proactive healthcare behaviors. Patients learning primarily about stone passage and pain management without understanding prevention strategies miss opportunities for dietary modifications and lifestyle changes that could prevent recurrence. This educational gap is noteworthy given that kidney stone recurrence rates remain high at 50% within 10 years, and evidence-based prevention strategies—including dietary modification and fluid management—depend on patient awareness and understanding.⁴ While a causal link between social media content quality and recurrence rates cannot be established from the present data, improving the prevention-oriented content available on these platforms represents a plausible avenue for supporting patient self-management.

4.5. Algorithmic and technological considerations

The inverse relationship between content popularity and educational quality observed in this study is consistent with a broader tension between the design priorities of short-video platforms and the needs of public health communication.³⁶–³⁸ Platforms built around engagement optimization may structurally favor brief, emotionally salient content formats, a pattern that could disadvantage comprehensive, evidence-based health education. While prior research has explored concepts such as algorithmic ‘filter bubbles’ in health information contexts,³⁹,⁴⁰ our study did not examine content recommendation patterns or user exposure pathways. The weak associations between engagement metrics and quality scores (r=0.26–0.47) are consistent with the hypothesis that high-visibility content does not necessarily reflect high educational value, though the mechanisms underlying this pattern require direct investigation.

Technological solutions for improving health information quality on social media platforms require careful consideration of both algorithmic modification and content creator support systems. Proposed interventions may include content quality labeling systems, professional credential verification for health-related creators, and automated quality feedback tools available during the upload process.⁴¹,⁴² Platform developers and public health stakeholders could also explore partnership programs with healthcare organizations for content review, as well as transparency mechanisms that help users identify creator credentials. These platform-level strategies do not presuppose specific algorithmic changes but focus on observable, modifiable aspects of content presentation and creator accountability.

4.6. Limitations and future research directions

Several limitations constrain the generalizability and interpretation of our findings. The cross-sectional design captures only a snapshot of content quality at a single time point; given the dynamic nature of social media, video availability and quality may have changed substantially since data collection. Video retrieval was performed using only the single Chinese search term '肾结石' (kidney stones), which may have excluded relevant content indexed under symptom-based terms, treatment-specific queries, colloquial expressions, or platform-specific hashtags; consequently, the sample may not fully represent the entire spectrum of kidney stone-related content available on these platforms. Future studies should employ multi-term search strategies or systematic hashtag mapping to enhance retrieval comprehensiveness. Additionally, this study may be subject to survivorship bias, as low-quality or inaccurate videos may have been removed by platforms prior to data collection, potentially leading to an overestimate of overall content quality.

Two related statistical limitations also warrant explicit acknowledgment. The large number of comparisons performed across quality dimensions, uploader categories, content domains, and engagement metrics substantially increases the probability of spurious significant findings (Type I error). As no formal correction for multiple comparisons was applied—consistent with the exploratory intent of this study—all reported p-values should be interpreted with appropriate caution, and findings should be regarded as hypothesis-generating rather than confirmatory. Compounding this concern, the marked imbalance in subgroup sizes—with professional institutions (n=11) and nonprofessional institutions (n=7) comprising very few videos relative to professional individuals (n=116)—substantially limited statistical power for comparisons involving these categories, increasing the risk of Type II errors (false-negative findings). Conclusions pertaining to institutional uploaders in particular should be treated as preliminary, and replication in larger, more balanced samples is needed before definitive inferences can be drawn.
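To make the multiple-comparisons concern concrete, the following minimal sketch shows how a family of p-values could be adjusted with the Bonferroni and Holm step-down procedures. The study applied no such correction; the function names and the `raw_p` values are hypothetical and chosen only for illustration.

```python
def bonferroni_adjust(p_values):
    """Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is scaled by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical family of five comparisons (NOT the study's full set of tests).
raw_p = [0.001, 0.013, 0.002, 0.04, 0.20]
holm_p = holm_adjust(raw_p)
bonf_p = bonferroni_adjust(raw_p)
significant_after_holm = [p < 0.05 for p in holm_p]
```

Holm controls the family-wise error rate like Bonferroni but is never more conservative, which is why it is often preferred when many exploratory comparisons are reported together.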

Limitations related to measurement and generalizability should also be noted. Although GQS and modified DISCERN are validated instruments widely used in health information research, quality assessment inevitably involves some degree of subjective interpretation, even with standardized rater training and excellent inter-rater reliability (κ>0.8). These tools may also not capture all dimensions of quality relevant to short-video health content, such as cultural appropriateness and patient engagement. Our focus on Chinese platforms similarly limits international generalizability, though the patterns identified are broadly consistent with findings from analogous studies on Western platforms, suggesting these challenges may reflect wider trends in social media health communication.
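For readers unfamiliar with the κ statistic cited above, the sketch below computes an unweighted Cohen's kappa for two raters in plain Python. Whether the study used an unweighted or weighted form is not stated in this section, so the unweighted version is an assumption, and the two rating lists are hypothetical examples rather than the study's ratings.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal score distribution.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical GQS ratings (1-5) from two raters on ten videos (NOT study data).
rater_1 = [1, 2, 2, 3, 3, 3, 4, 4, 5, 2]
rater_2 = [1, 2, 2, 3, 3, 4, 4, 4, 5, 2]
kappa = cohen_kappa(rater_1, rater_2)
```

Because GQS and DISCERN scores are ordinal, a weighted kappa that penalizes near-misses less than large disagreements may be a reasonable alternative; the unweighted form shown here treats every disagreement equally.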

This study did not examine the actual impact of video exposure on patient knowledge, attitudes, or clinical outcomes—an important limitation given that such evidence would be essential for establishing the practical significance of the quality gaps identified. Future research should prioritize prospective, patient-centered designs that directly assess the relationship between social media health information consumption and health-seeking behavior and clinical outcomes. Longitudinal studies examining changes in content quality over time would also provide valuable insight into the effects of platform evolution and targeted interventions.

4.7. Recommendations for stakeholders

Healthcare institutions should develop comprehensive social media content strategies that recognize the potential and limitations of both platforms. Bilibili’s superior educational capacity suggests prioritizing this platform for detailed educational content, while TikTok’s engagement advantages may prove valuable for awareness campaigns and for directing users to comprehensive resources. Professional institutions should leverage their demonstrated quality advantages by increasing content creation efforts and providing guidance to individual healthcare providers engaging in social media education. Regulatory agencies should consider developing guidelines for health information on social media platforms that balance free expression with public health protection, for example by establishing minimum quality standards for health content, requiring health-related content creators to disclose their medical credentials, and making platforms responsible for identifying and addressing health misinformation.

Patient education initiatives should incorporate social media literacy components, teaching users to critically evaluate health information quality, identify reliable sources, and understand the limitations of social media for medical decision-making. Healthcare providers should proactively discuss social media health information with patients, addressing misconceptions and providing guidance for identifying trustworthy sources. Future research should focus on developing automated quality assessment tools for real-time content evaluation, examining the effectiveness of different intervention strategies for improving content quality, and conducting longitudinal studies to assess the impact of improved social media health education on patient outcomes and clinical metrics.

5. Conclusion

This cross-sectional study reveals concerning quality deficiencies in kidney stone health information on TikTok and Bilibili, despite high professional creator representation (67.44%). While Bilibili demonstrated superior educational quality due to its extended content format, both platforms showed suboptimal reliability scores and inadequate coverage of essential prevention and risk factor information. Professional institutions consistently produced the highest-quality content, yet overall information quality remained insufficient for optimal patient education. The inverse relationship between engagement metrics and educational quality highlights systematic challenges in social media health communication. These findings underscore the urgent need for enhanced content standards, professional creator training, and platform-specific quality assurance mechanisms to ensure accurate kidney stone information reaches patients seeking health guidance through short video platforms.

Supplemental material

Supplemental material - Quality and reliability of kidney stone information on TikTok and Bilibili: A cross-sectional study

Supplemental material for Quality and reliability of kidney stone information on TikTok and Bilibili: A cross-sectional study by Wenbin Li, Jianhua Zhang, Jie Guo, Lan Wang, Yujie Ma, Ming Li, Junqing Hou and Song Li in Digital Health.

Footnotes

ORCID iD

Wenbin Li

Ethical considerations

This study did not involve clinical data, human specimens, or laboratory animals. All data were obtained from publicly available videos on TikTok and Bilibili, ensuring that no personal privacy issues were involved. Since the study did not include any user interaction, an ethics review was not required.

Author contributions

All authors contributed to the study conception and design. The first draft of the manuscript was written by Wenbin Li. Data collection and experimental work were performed by Jianhua Zhang and Jie Guo. Yujie Ma and Lan Wang conducted the data analysis. Ming Li, Junqing Hou and Song Li critically reviewed the manuscript and were responsible for supervision and funding acquisition. All authors read and approved the final manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Science and Technology Development Plan of Henan Province (Grant Nos. 252102311056 and 242102310065) and the Science and Technology Development Plan of Kaifeng City (Grant No. 2403093).

Declaration of conflicting interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Data Availability Statement

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Supplemental material

Supplemental material for this article is available online.

References

1. Sorokin, Mamoulakis, Miyazawa, et al. Epidemiology of stone disease across the world. World J Urol 2017; 35: 1301–1320. https://doi.org/10.1007/s00345-017-2008-6

2. Wang, Fan, Huang, et al. Prevalence of kidney stones in mainland China: a systematic review. Sci Rep 2017; 7: 41630. https://doi.org/10.1038/srep41630

3. Zeng. Age-specific prevalence of kidney stones in Chinese urban inhabitants. Urolithiasis 2013; 41: 91–93. https://doi.org/10.1007/s00240-012-0520-0

4. Rule, Lieske, et al. The ROKS nomogram for predicting a second symptomatic stone episode. J Am Soc Nephrol: JASN 2014; 25: 2878–2886. https://doi.org/10.1681/ASN.2013091011

5. Song, Xue, Zhao, et al. Short-video apps as a health information source for chronic obstructive pulmonary disease: information quality assessment of TikTok videos. J Med Internet Res 2021; 23: e28318. https://doi.org/10.2196/28318

6. Wang, Song, et al. The reliability and quality of short videos as a source of dietary guidance for inflammatory bowel disease: cross-sectional study. J Med Internet Res 2023; 25: e41518. https://doi.org/10.2196/41518

7. Lin, Zhang, et al. Hip fractures in Chinese TikTok (Douyin) short videos: an analysis of information quality, content and user comment attitudes. Front Public Health 2025; 13: 1563188. https://doi.org/10.3389/fpubh.2025.1563188

8. Suarez-Lledo, Alvarez-Galvez. Prevalence of health misinformation on social media: systematic review. J Med Internet Res 2021; 23: e17187. https://doi.org/10.2196/17187

9. Basch, Hillyer, Meleo-Erwin, et al. Preventive behaviors conveyed on YouTube to mitigate transmission of COVID-19: cross-sectional study. JMIR Public Health Surveillance 2020; 6: e18807. https://doi.org/10.2196/18807

10. Comp, Dyer, Gottlieb. Is TikTok the next social media frontier for medicine? AEM Educ Train 2020; 5: e10532. Epub ahead of print July 2021. https://doi.org/10.1002/aet2.10532

11. Charnock, Shepperd, Needham, et al. DISCERN: an instrument for judging the quality of written consumer health information on treatment choices. J Epidemiol Community Health 1999; 53: 105–111. https://doi.org/10.1136/jech.53.2.105

12. Singh. YouTube for information on rheumatoid arthritis--a wakeup call? J Rheumatol 2012; 39: 899–903. https://doi.org/10.3899/jrheum.111114

13. Sun, Zheng. Quality of information in gallstone disease videos on TikTok: cross-sectional study. J Med Internet Res 2023; 25: e39162. https://doi.org/10.2196/39162

14. Wang, Yao, Wang, et al. Bilibili, TikTok, and YouTube as sources of information on gastric cancer: assessment and analysis of the content and quality. BMC Public Health 2024; 24: 57. https://doi.org/10.1186/s12889-023-17323-x

15. Cao, Zhang, Zhu, et al. Quality of cataract-related videos on TikTok and its influencing factors: a cross-sectional study. DIGITAL HEALTH 2025; 11: 20552076251365086. https://doi.org/10.1177/20552076251365086

16. Yurdaisik. Analysis of the most viewed first 50 videos on YouTube about breast cancer. Biomed Res Int 2020; 2020: 2750148. https://doi.org/10.1155/2020/2750148

17. Wong HPN, Senthamil, et al. A cross-sectional quality assessment of TikTok content on benign prostatic hyperplasia. World J Urol 2023; 41: 3051–3057. https://doi.org/10.1007/s00345-023-04601-x

18. Abramson, Feiertag, Javidi, et al. Accuracy of prostate cancer screening recommendations for high‐risk populations on YouTube and TikTok. BJUI Compass 2023; 4: 206–213. https://doi.org/10.1002/bco2.200

19. Zheng, Tong, Wan, et al. Quality and reliability of liver cancer-related short Chinese videos on TikTok and Bilibili: cross-sectional content analysis study. J Med Internet Res 2023; 25: e47210. https://doi.org/10.2196/47210

20. Jiakuan, Chaoxiang, Zhang, et al. Evaluating the reliability and quality of knee osteoarthritis educational content on TikTok and Bilibili: a cross-sectional content analysis. DIGITAL HEALTH 2025; 11: 20552076251366390.

21. Mueller, Hongler VNS, Jungo, et al. Fiction, falsehoods, and few facts: cross-sectional study on the content-related quality of atopic eczema-related videos on YouTube. J Med Internet Res 2020; 22: e15599. https://doi.org/10.2196/15599

22. Zhou, et al. The reliability and quality of short videos as health information of guidance for lymphedema: a cross-sectional study. Front Public Health 2025; 12: 1472583. https://doi.org/10.3389/fpubh.2024.1472583

23. Assimos, Krambeck, Miller, et al. Surgical management of stones: American Urological Association/Endourological Society guideline, PART I. J Urol 2016; 196: 1153–1160. https://doi.org/10.1016/j.juro.2016.05.090

24. Türk, Petřík, Sarica, et al. EAU guidelines on diagnosis and conservative management of urolithiasis. Eur Urol 2016; 69: 468–474. https://doi.org/10.1016/j.eururo.2015.07.040

25. Goobie, Guler, Johannson, et al. YouTube videos as a source of misinformation on idiopathic pulmonary fibrosis. Ann Am Thorac Soc 2019; 16: 572–579. https://doi.org/10.1513/AnnalsATS.201809-644OC

26. Chen, Wang, Huang, et al. The quality and reliability of short videos about thyroid nodules on BiliBili and TikTok: cross-sectional study. DIGITAL HEALTH 2024; 10: 20552076241288831. https://doi.org/10.1177/20552076241288831

27. Niu, Hao, Yang, et al. Quality of pancreatic neuroendocrine tumor videos available on TikTok and Bilibili: content analysis. JMIR Form Res 2024; 8: e60033. https://doi.org/10.2196/60033

28. Ren, et al. The quality and reliability of short videos about premature ovarian failure on Bilibili and TikTok: cross-sectional study. DIGITAL HEALTH 2025; 11: 20552076251351077. https://doi.org/10.1177/20552076251351077

29. McHugh. Interrater reliability: the kappa statistic. Biochem Med 2012; 22: 276–282.

30. Liu, Peng, et al. Assessment of the reliability and quality of breast cancer related videos on TikTok and Bilibili: cross-sectional study in China. Front Public Health 2023; 11: 1296386. https://doi.org/10.3389/fpubh.2023.1296386

31. Hudon, Perry, Plate A-S, et al. Navigating the maze of social media disinformation on psychiatric illness and charting paths to reliable information for mental health professionals: observational study of TikTok videos. J Med Internet Res 2025; 27: e64225. https://doi.org/10.2196/64225

32. Acero, Herrero, Foncham, et al. Accuracy, quality, and misinformation of YouTube abortion procedural videos: cross-sectional study. J Med Internet Res 2024; 26: e50099. https://doi.org/10.2196/50099

33. Riemma, Carotenuto, Casolari, et al. Assessing quality, reliability and accuracy of polycystic ovary syndrome‐related content on TikTok: a video‐based cross‐sectional analysis. Int J Gynecol Obstet 2025; 170: 274–283. https://doi.org/10.1002/ijgo.70007

34. Peng, et al. Comparative analysis of NAFLD-related health videos on TikTok: a cross-language study in the USA and China. BMC Public Health 2024; 24: 3375. https://doi.org/10.1186/s12889-024-20851-9

35. Augustyn, Richards, McGrath, et al. An examination of the quality of kidney stone information on YouTube and TikTok. Urolithiasis 2025; 53: 1–6. https://doi.org/10.1007/s00240-025-01713-4

36. Song, Zhang. Interventions to support consumer evaluation of online health information credibility: a scoping review. Int J Med Inf 2021; 145: 104321. https://doi.org/10.1016/j.ijmedinf.2020.104321

37. Jiang, Zhou, Qiu, et al. Search engines and short video apps as sources of information on acute pancreatitis in China: quality assessment and content assessment. Front Public Health 2025; 13: 1578076. https://doi.org/10.3389/fpubh.2025.1578076

38. Dai, Gong, et al. The quality and reliability of online videos as an information source of public health education for stroke prevention in mainland China: electronic media–based cross-sectional study. JMIR Infodemiol 2025; 5: e64891. https://doi.org/10.2196/64891

39. Lin, Zhu, et al. The most popular videos promoting breast enhancement products on TikTok: cross-sectional content and user engagement analysis. J Med Internet Res 2025; 27: e73336. https://doi.org/10.2196/73336

40. Zhu, Wang, et al. Information quality of videos related to esophageal cancer on TikTok, Kwai, and Bilibili: a cross-sectional study. BMC Public Health 2025; 25: 2245. https://doi.org/10.1186/s12889-025-23475-9

41. Cuan-Baltazar, Muñoz-Perez, Robledo-Vega, et al. Misinformation of COVID-19 on the internet: infodemiology study. JMIR Public Health Surveillance 2020; 6: e18444. https://doi.org/10.2196/18444

42. Liu, Chen, Lin, et al. YouTube/Bilibili/TikTok videos as sources of medical information on laryngeal carcinoma: cross-sectional content analysis study. BMC Public Health 2024; 24: 1594. https://doi.org/10.1186/s12889-024-19077-6

