Sage Journals: Discover world-class research

Abstract

Background

Chimeric antigen receptor T-cell (CAR-T) therapy represents a transformative advancement in cancer treatment. While public interest in CAR-T has surged, particularly through short-video platforms like TikTok and Bilibili in China, concerns remain regarding the reliability and quality of health information disseminated through such media.

Objective

This study aimed to systematically evaluate the content quality, scientific integrity, and user engagement of CAR-T-related videos on Bilibili and TikTok, and to assess whether high traffic equates to high information quality.

Methods

A total of 200 Chinese-language videos (100 per platform) were identified using the keyword “CAR-T.” Videos were evaluated using three scoring tools: the DISCERN instrument for reliability, the Global Quality Score (GQS), and a novel CAR-T-specific checklist assessing 12 core domains. Content characteristics, source types, and engagement metrics (likes, comments, shares, and saves) were also extracted and compared across platforms and content types.

Results

TikTok videos demonstrated significantly higher user engagement but poorer structure and lower DISCERN scores than Bilibili (P < 0.001). Videos posted by medical professionals were more common on TikTok (56%) and had higher engagement, but not necessarily higher quality. Bilibili, dominated by academic sources, produced longer videos with more complete and structured information. Correlation analysis revealed strong consistency among quality scoring tools but weak associations between quality and engagement metrics, suggesting a “high popularity–low quality” paradox.

Conclusion

CAR-T-related content on Chinese short-video platforms is characterized by a disconnect between popularity and information quality. Effective science communication strategies and platform-level interventions are needed to mitigate misinformation risks and improve the dissemination of high-quality medical content.

Keywords

CAR-T therapy, short-video platforms, TikTok, Bilibili, health communication, content quality, DISCERN, Global Quality Score

Introduction

Chimeric antigen receptor T-cell (CAR-T) therapy is a revolutionary breakthrough in cancer immunotherapy, fundamentally transforming oncological treatment. This innovative approach involves genetically modifying patients’ T-cells to express synthetic receptors targeting tumor-associated antigens, enhancing immune recognition and elimination of malignant cells.¹ The process includes T-cell collection, ex vivo genetic modification using viral vectors, cell expansion, and patient reinfusion.² This personalized strategy has demonstrated remarkable efficacy in treating previously intractable hematological malignancies. Currently, CAR-T therapy is mainly indicated for relapsed or refractory B-cell acute lymphoblastic leukemia (B-ALL), diffuse large B-cell lymphoma (DLBCL), and multiple myeloma (MM).^3,4 In B-ALL, CAR-T therapy has demonstrated substantial efficacy, particularly in pediatric patients who have failed multiple lines of chemotherapy. Clinical trials have reported complete remission (CR) rates exceeding 80% in relapsed or refractory B-ALL, making CAR-T an important treatment option for this highly aggressive leukemia subtype.³ Similarly, in DLBCL, CAR-T therapy provides meaningful clinical benefit, with response rates ranging from 40% to 70% among patients whose disease has not responded to conventional treatments.⁵ CAR-T therapy has also shown promising results in MM, with clinical studies indicating improved treatment outcomes.
FDA-approved products, including tisagenlecleucel and axicabtagene ciloleucel, have established this modality as standard care for specific populations.^6,7 However, because of the complexity of CAR-T therapy, comprehensive management is required throughout the treatment continuum, including pre-treatment assessment, manufacturing coordination, bridging therapy, and intensive monitoring for complications such as cytokine release syndrome and immune effector cell-associated neurotoxicity syndrome.^8,9

Despite therapeutic advantages, CAR-T therapy faces significant challenges, including high costs, limited accessibility, severe adverse events, and specialized infrastructure requirements.¹⁰ Long-term efficacy concerns, including antigen escape and T-cell persistence, continue driving research efforts.¹¹ These complexities highlight the critical importance of accurate public education regarding CAR-T therapy.

With the rapid development of social media platforms, short videos have become one of the most popular content formats in China, particularly for disseminating health information. According to the Statistical Report on China's Internet Development (CNNIC's 55th Report), short-video platforms have surpassed the 1 billion user mark, covering nearly all major groups of China's Internet users. Platforms such as TikTok and Bilibili attract distinct audiences: TikTok captivates broad social demographics through its algorithmic recommendations and fast-paced visual content, while Bilibili caters more to younger, knowledge-oriented users. Although platforms such as Kuaishou, WeChat, and Xiaohongshu also hold significant positions in China's Internet landscape, Bilibili and TikTok became the main platforms for this study due to their distinct content ecosystems and user demographics.

With increasing public health awareness and digital media proliferation, short-video platforms such as Bilibili and TikTok have become significant medical information sources.¹² However, democratization of medical information dissemination raises substantial concerns about content quality, scientific accuracy, and misinformation propagation.¹³ Previous research examining health-related content on short-video platforms, including topics of pancreatic cancer and inflammatory bowel disease, consistently revealed moderate to poor information quality.^14,15 Videos produced by healthcare professionals demonstrated superior quality compared to non-medical sources.¹⁶ In this context, health literacy and communication skills are crucial for ensuring that the information disseminated on these platforms is both accurate and easy to understand. Health literacy refers to an individual's ability to access, comprehend, evaluate, and apply health information, and it forms the foundation for patients to correctly understand treatment options and associated risks. With effective communication skills, information providers can convey complex medical knowledge to the public in a clear and accessible manner, thus improving the effectiveness of information delivery. This is particularly important for complex therapies such as CAR-T, where clarity of communication and accuracy of information are essential. In addition, differences in platform dissemination mechanisms, algorithmic recommendations, and user engagement patterns may influence the quality of information diffusion and contribute to the spread of misleading content.

Given CAR-T therapy's growing clinical prominence and increasing representation in popular media, a substantial volume of CAR-T-related content of uncertain quality and reliability now exists on short-video platforms. Because of its complexity, CAR-T therapy is particularly susceptible to oversimplification and misrepresentation. The consequences of misinformation are concerning, as inaccurate information could influence patient decisions, create unrealistic expectations, or delay appropriate interventions. Therefore, this study systematically investigates the content quality and reliability of CAR-T-related educational videos on Bilibili and TikTok. We used three assessment tools: the Global Quality Score (GQS), the DISCERN instrument, and a novel CAR-T-specific scoring checklist evaluating 12 critical domains unique to CAR-T therapy. The DISCERN instrument is a widely used tool for evaluating the reliability and quality of written consumer health information, particularly regarding treatment options.¹⁷ It assesses factors such as structure, source credibility, and balance of information, and it has been applied in various healthcare studies, including studies of cancer-related content, to measure the accuracy and comprehensiveness of information. The GQS is another assessment tool used to evaluate the overall quality of health-related videos, considering clarity, completeness, and educational value.¹⁸ It has been used in studies assessing medical videos on platforms such as YouTube and TikTok, and is effective for determining how well information is communicated to the public. This three-tool approach provides a comprehensive evaluation of scientific accuracy, clinical completeness, and educational value. Additionally, we analyzed relationships between video engagement metrics and content quality to determine whether popularity is correlated with educational value.

This study aims to quantitatively analyze and evaluate the quality of CAR-T-related content disseminated on short-video platforms, as well as the relationship between platform engagement metrics (e.g. likes, comments, and shares) and content quality. With the rapid development of social media platforms, short videos have become an important tool for disseminating health information, particularly for complex and highly specialized medical topics such as CAR-T therapy. The purpose of this study is to assess the effectiveness and scientific rigor of CAR-T information communicated via short-video platforms and explore how factors such as platform popularity and engagement patterns may be related to content quality.

Methods

Research design

In this study, a cross-sectional content analysis approach is used to systematically evaluate the quality and scientific accuracy of video content related to CAR-T therapy on China's mainstream short-video platforms (Bilibili and TikTok). The research design adheres to the STROBE statement to ensure systematic and standardized research procedures. Given the highly dynamic nature and frequent updates of content on short-video platforms, this study focuses on video samples captured at specific points in time to reflect the characteristics of the content ecosystem in that phase. It is not a long-term tracking study with repeatable data collection capabilities.

Video screening and inclusion criteria

The research team searched both Bilibili and TikTok using the keyword “CAR-T.” The screening period ended on 1 June 2025; this date was chosen so that both platforms were evaluated at the same time point. To enhance the representativeness of the sample, 100 relevant videos were included from each platform, for a total of 200 videos. Inclusion criteria were as follows: (1) the video focused on CAR-T therapy as its core theme and had some scientific or educational value; (2) the video was in Chinese; and (3) the video was at least 15 seconds long and fully playable. Exclusion criteria were content unrelated to CAR-T therapy, videos consisting solely of music or image collages, purely visual presentations without audio or subtitles, and content that was clearly commercial advertising or promotional material.

Variable extraction and classification criteria

All selected videos underwent independent variable extraction and categorization by two researchers, including platform source (Bilibili or TikTok), release date (year), video duration (seconds), and user engagement metrics (likes, comments, saves, and shares). Additionally, researchers divided video publishers into five groups: medical professionals (including clinicians such as doctors, nurses, and pharmacists, or individuals affiliated with healthcare institutions), news media, non-professional creators, research institutions, and researchers (including those mainly affiliated with universities or research institutes, focusing on academic and research-oriented content). Content types were classified into two categories: “case studies” and “science popularization.” In case of disagreement over variable categories, a third researcher made the final review.

Content quality assessment tool

To comprehensively evaluate the scientific validity, structural integrity, and dissemination quality of the video, the study employs the following three tools:

First, the DISCERN tool is used to assess the reliability of the treatment information, covering dimensions such as structure, source, and balance, with a five-level rating scale.¹⁷ Second, the GQS is applied to subjectively evaluate the overall dissemination quality of videos on a scale from 0 to 4.¹⁹ Higher scores indicate clearer video structure, more complete content, and greater educational value. Third, the research team developed a dedicated content quality scoring system for CAR-T therapy, covering 12 core knowledge domains (such as treatment mechanisms, indications, efficacy, safety management, and treatment procedures). Each domain is worth 1 point, with a maximum total score of 12 points (see Supplemental Table 1). This system measures the completeness and depth of professional information presented in videos.

Rating process and consistency assessment

Two medically trained researchers who received standardized training independently rated each video using three assessment tools: the GQS, the DISCERN instrument, and the self-developed CAR-T therapy evaluation form. Clear reference standards were provided prior to scoring, and Cohen's kappa coefficient was used to assess inter-rater reliability after scoring. In case of disagreement, the two raters first discussed to reach a consensus, and a third researcher reviewed the case and made the final determination if a consensus could not be reached.
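The agreement statistic described above can be illustrated with a minimal sketch. It computes unweighted Cohen's kappa for two raters who scored the same videos. The function name `cohens_kappa` and the example ratings are hypothetical, and the text does not state whether a weighted variant was used for the ordinal DISCERN and GQS scores, so this should be read as an illustration of the statistic rather than the authors' exact procedure.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical DISCERN-style ratings (1-5) for four videos.
rater_1 = [1, 1, 2, 2]
rater_2 = [1, 2, 2, 2]
print(cohens_kappa(rater_1, rater_2))  # 0.5
```

Values closer to 1 indicate agreement well beyond what the raters' marginal score distributions would produce by chance.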

Statistical analysis methods

All statistical analyses were performed using R software (Version 4.3.0). Continuous variables with skewed distributions were expressed as median and interquartile range [M (Q₁, Q₃)]. Intergroup comparisons were performed using the Mann–Whitney U-test (two groups) or the Kruskal–Wallis H-test (more than two groups). Categorical variables were presented as frequencies and percentages [n (%)] and compared between groups using the chi-square test or Fisher's exact test. Correlations between variables were assessed using Spearman's rank correlation. All tests were two-sided, and P < 0.05 was considered statistically significant. To preserve the sensitivity of the exploratory analysis, post-hoc comparisons were not Bonferroni-corrected. Data visualization charts were generated using Origin 2021 software.
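As an illustration of the two-group comparison named above, the following sketch computes the Mann–Whitney U statistic and its Z value under a normal approximation. It assumes a tie-corrected variance and no continuity correction; the analysis in this study was run in R, and the exact options used there are not stated, so the numbers from this sketch need not match the reported Z values. The function names and the example durations are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_z(x, y):
    """Return (U for x, Z, two-sided P) using the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: each group of t tied values contributes t^3 - t.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided P from the normal tail
    return u1, z, p


# Hypothetical video durations in seconds for two platforms.
short_videos = [60, 95, 102, 110, 148]
long_videos = [224, 310, 430, 900, 1518]
u, z, p = mann_whitney_z(short_videos, long_videos)
print(u, round(z, 2), round(p, 4))
```

The sign of Z depends only on which group is passed first, so a negative value here simply means the first group tends to have the smaller values.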

Results

Information quality of CAR-T videos on TikTok and Bilibili

In DISCERN ratings (Figure 1(a)), both platforms showed a tendency toward central concentration. However, TikTok featured a higher proportion of 1-point videos, with a distribution skewed toward lower scores, while Bilibili videos were concentrated between 2 and 3 points. Statistical analysis revealed that 1-point videos accounted for 30.0% of TikTok content, compared to only 6.0% on Bilibili, and the score distributions differed significantly between platforms (χ² = 24.06, P < 0.001). In the GQS scores (Figure 1(b)), the overall distribution across both platforms was similar, although Bilibili showed a slight tendency toward the range of 2–3 points, while TikTok had greater dispersion. The difference in GQS scores between the two platforms was not statistically significant (χ² = 7.97, P = 0.093), indicating that neither platform was clearly superior and that overall dissemination quality fluctuated on both.

Figure 1.

Distribution of quality scores for chimeric antigen receptor T-cell (CAR-T) videos across different platforms.

Regarding CAR-T guideline scores (Figure 1(c)), videos on both platforms predominantly scored in the low range of 0–3 points, with very few high-scoring videos. Bilibili showed slightly more high-scoring cases (e.g. scores of 5–6), while TikTok scores were more concentrated with a narrower distribution. Despite slight overall differences, the variation in CAR-T content completeness scores between platforms was not statistically significant (P = 0.145). In summary, Figure 1 reveals that TikTok has a substantial number of low-rated videos and performs particularly poorly on the DISCERN criteria for medical information. Meanwhile, Bilibili demonstrates a slight advantage in the completeness of academic content.

Figure 2.

Correlation analysis of key variables in chimeric antigen receptor T-cell (CAR-T) videos across Bilibili (a) and TikTok (b) platforms.

Platform-based differences in video characteristics and user engagement

This study included a total of 200 short videos, with 100 videos each from the Bilibili and TikTok platforms. Videos on TikTok were posted more recently (median year 2024 vs. 2023, Z = −5.20, P < 0.001) and were significantly shorter (median 102.05 seconds vs. 430.50 seconds, Z = −9.56, P < 0.001). Regarding user engagement metrics, TikTok videos had significantly higher median numbers of comments (59.50 vs. 1.50), likes (534.50 vs. 30.50), saves (141.00 vs. 64.50), and shares (145.00 vs. 15.50) than Bilibili videos (P < 0.001 for all). Regarding content sources, videos posted by medical professionals accounted for 56.0% of TikTok's content, significantly higher than Bilibili's 4.0%. By contrast, Bilibili featured more content from research institutions (26.0%) and researchers (22.0%). The composition of content sources differed significantly between platforms (χ² = 80.00, P < 0.001). Case-based videos also made up a higher proportion on TikTok (20.0% vs. 9.0%, P = 0.027). Regarding video quality, TikTok had a significantly higher proportion of videos scoring 1 on the DISCERN scale (30.0% vs. 6.0%, P < 0.001). However, no statistically significant difference was observed between platforms for GQS scores (P = 0.093) or CAR-T-specific scores (P = 0.145) (Table 1). Inter-rater reliability showed good agreement between the two raters across all three assessment tools: GQS κ = 0.794, DISCERN κ = 0.776, and the self-developed 12-domain tool κ = 0.753.

Table 1.

Comparison of content characteristics and quality scores for CAR-T-related short videos across platforms (Bilibili vs. TikTok).

Variables	Total (n = 200)	Bilibili (n = 100)	TikTok (n = 100)	Statistic	P
Year, M (Q₁, Q₃)	2024.00 (2022.00, 2025.00)	2023.00 (2022.00, 2024.00)	2024.00 (2023.00, 2025.00)	Z = −5.20	<0.001
Comment, M (Q₁, Q₃)	14.50 (1.00, 91.25)	1.50 (0.00, 14.50)	59.50 (14.75, 190.00)	Z = −8.21	<0.001
Like, M (Q₁, Q₃)	169.50 (25.00, 699.75)	30.50 (8.00, 149.00)	534.50 (207.00, 1958.50)	Z = −7.93	<0.001
Collect, M (Q₁, Q₃)	95.00 (23.00, 352.75)	64.50 (15.75, 222.50)	141.00 (61.75, 431.00)	Z = −3.52	<0.001
Share, M (Q₁, Q₃)	56.00 (10.75, 262.00)	15.50 (4.00, 60.50)	145.00 (47.00, 677.75)	Z = −6.53	<0.001
Time, M (Q₁, Q₃)	166.16 (93.00, 442.63)	430.50 (224.25, 1517.75)	102.05 (59.70, 147.75)	Z = −9.56	<0.001
Source, n (%)				χ² = 80.00	<0.001
Medical personnel	60 (30.00)	4 (4.00)	56 (56.00)
News media	19 (9.50)	7 (7.00)	12 (12.00)
Non-professional personnel	65 (32.50)	41 (41.00)	24 (24.00)
Research institute	32 (16.00)	26 (26.00)	6 (6.00)
Researcher	24 (12.00)	22 (22.00)	2 (2.00)
Content, n (%)				χ² = 4.88	0.027
Case report	29 (14.50)	9 (9.00)	20 (20.00)
Science popularization	171 (85.50)	91 (91.00)	80 (80.00)
DISCERN, n (%)				χ² = 24.06	<0.001
1	36 (18.00)	6 (6.00)	30 (30.00)
2	79 (39.50)	44 (44.00)	35 (35.00)
3	68 (34.00)	38 (38.00)	30 (30.00)
4	11 (5.50)	6 (6.00)	5 (5.00)
5	6 (3.00)	6 (6.00)	0 (0.00)
GQS, n (%)				χ² = 7.97	0.093
0	2 (1.00)	2 (2.00)	0 (0.00)
1	40 (20.00)	25 (25.00)	15 (15.00)
2	109 (54.50)	54 (54.00)	55 (55.00)
3	35 (17.50)	12 (12.00)	23 (23.00)
4	14 (7.00)	7 (7.00)	7 (7.00)
CAR-T guidelines, n (%)				–	0.145
0	23 (11.50)	9 (9.00)	14 (14.00)
1	68 (34.00)	28 (28.00)	40 (40.00)
2	58 (29.00)	35 (35.00)	23 (23.00)
3	25 (12.50)	10 (10.00)	15 (15.00)
4	12 (6.00)	8 (8.00)	4 (4.00)
5–8	14 (7.00)	10 (10.00)	4 (4.00)

Z: Mann-Whitney test; χ²: chi-square test; –: Fisher's exact.

M: median; Q₁: first quartile; Q₃: third quartile; CAR-T: chimeric antigen receptor T-cell; GQS: Global Quality Score.
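The χ² statistics in Table 1 can be re-derived from the reported counts. The sketch below assumes a Pearson chi-square test of independence without continuity correction, which is an assumption about the procedure rather than something the text states; the function name `chi_square_statistic` is hypothetical. The counts are copied from Table 1, with Bilibili as the first row and TikTok as the second.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of platform and category.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Counts from Table 1 (row 1: Bilibili, row 2: TikTok).
discern = [[6, 44, 38, 6, 6], [30, 35, 30, 5, 0]]   # DISCERN scores 1-5
content = [[9, 91], [20, 80]]                        # case report, science popularization
source = [[4, 7, 41, 26, 22], [56, 12, 24, 6, 2]]    # five publisher groups

for name, table in [("DISCERN", discern), ("Content", content), ("Source", source)]:
    stat, dof = chi_square_statistic(table)
    print(f"{name}: chi2 = {stat:.2f}, dof = {dof}")
# DISCERN: chi2 = 24.06, dof = 4
# Content: chi2 = 4.88, dof = 1
# Source: chi2 = 80.00, dof = 4
```

Under this assumption the recomputed statistics agree with the values reported in Table 1.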

Content type and publisher source distribution across platforms

In terms of content types, Bilibili mainly features science popularization videos, accounting for 91% of its content, while case report videos constitute only 9% (Figure 3(a)). On TikTok, science popularization videos make up 80% of the content (Figure 3(c)), with case-related videos rising to 20%. The difference in content composition between the two platforms is statistically significant (χ² = 4.88, P = 0.027), indicating that TikTok tends to disseminate more content related to clinical cases.

Figure 3.

Distribution of chimeric antigen receptor T-cell (CAR-T)-related videos on Bilibili and TikTok platforms by content type (a, c) and publisher source (b, d).

Regarding the sources of video publishers, Bilibili mainly features non-professionals (41.0%), followed by research institutions (26.0%) and researchers (22.0%), with medical professionals accounting for only 4.0% (Figure 3(b)). In contrast, TikTok videos are predominantly published by medical professionals, accounting for 56.0%, followed by non-professionals (24.0%), news media (12.0%), research institutions (6.0%), and researchers (2.0%) (Figure 3(d)). The two platforms differ significantly in content source composition (χ² = 80.00, P < 0.001), indicating that TikTok content is more concentrated among healthcare practitioners, while Bilibili draws on a broader mix of sources, with research institutions and researchers together contributing nearly half of its videos (48.0%).

Variation in content sources between TikTok and Bilibili

Video performance and quality differed by source. Videos posted by medical professionals were predominantly concentrated on the TikTok platform (93.33%), had the most recent median posting year (2025), and had the shortest median duration (94.45 seconds). However, they achieved the highest user engagement metrics (median likes: 422.00; median comments: 45.00), with differences across sources being statistically significant (P < 0.001). Researchers and research institutions posted longer videos (median 853.00 and 301.50 seconds, respectively), but their like and comment counts were significantly lower than those of medical professionals. Videos from different sources showed significant differences in GQS scores (P = 0.007): videos by medical professionals received higher scores, while videos by news media and non-professionals clustered around GQS scores of 1–2. DISCERN and CAR-T scores showed no significant difference across sources (P = 0.730 and P = 0.971, respectively) (Table 2).

Table 2.

Comparison of video performance and quality scores across different source types.

Variables	Total (n = 200)	Medical personnel (n = 60)	News media (n = 19)	Non-professional personnel (n = 65)	Research institute (n = 32)	Researcher (n = 24)	Statistic	P
Year, M (Q₁, Q₃)	2024.00 (2022.00, 2025.00)	2025.00 (2023.00, 2025.00)	2023.00 (2023.00, 2025.00)	2023.00 (2022.00, 2024.00)	2023.00 (2021.75, 2025.00)	2023.00 (2022.00, 2023.25)	χ² = 19.97^#	<0.001
Comment, M (Q₁, Q₃)	14.50 (1.00, 91.25)	45.00 (8.75, 151.75)	10.00 (1.50, 18.00)	35.00 (1.00, 168.00)	2.00 (0.00, 11.00)	2.00 (0.00, 25.25)	χ² = 33.80^#	<0.001
Like, M (Q₁, Q₃)	169.50 (25.00, 699.75)	422.00 (150.75, 1479.75)	95.00 (26.50, 768.50)	197.00 (31.00, 889.00)	18.00 (9.75, 104.75)	35.00 (4.00, 185.75)	χ² = 34.23^#	<0.001
Collect, M (Q₁, Q₃)	95.00 (23.00, 352.75)	135.00 (60.25, 363.00)	62.00 (12.50, 297.50)	100.00 (31.00, 526.00)	61.00 (15.50, 149.75)	82.50 (11.75, 223.75)	χ² = 9.27^#	0.055
Share, M (Q₁, Q₃)	56.00 (10.75, 262.00)	120.50 (39.50, 355.00)	73.00 (10.50, 457.00)	58.00 (11.00, 343.00)	11.00 (7.50, 61.75)	17.00 (3.00, 58.25)	χ² = 24.90^#	<0.001
Time, M (Q₁, Q₃)	166.16 (93.00, 442.63)	94.45 (59.33, 144.79)	114.62 (101.78, 266.00)	184.44 (114.08, 412.00)	301.50 (137.89, 1659.25)	853.00 (287.50, 2050.50)	χ² = 57.87^#	<0.001
Platform, n (%)							χ² = 80.00	<0.001
B	100 (50.00)	4 (6.67)	7 (36.84)	41 (63.08)	26 (81.25)	22 (91.67)
T	100 (50.00)	56 (93.33)	12 (63.16)	24 (36.92)	6 (18.75)	2 (8.33)
DISCERN, n (%)							–	0.730*
1	36 (18.00)	13 (21.67)	4 (21.05)	13 (20.00)	4 (12.50)	2 (8.33)
2	79 (39.50)	17 (28.33)	6 (31.58)	27 (41.54)	17 (53.12)	12 (50.00)
3	68 (34.00)	25 (41.67)	8 (42.11)	19 (29.23)	8 (25.00)	8 (33.33)
4	11 (5.50)	3 (5.00)	1 (5.26)	3 (4.62)	2 (6.25)	2 (8.33)
5	6 (3.00)	2 (3.33)	0 (0.00)	3 (4.62)	1 (3.12)	0 (0.00)
GQS, n (%)							–	0.007*
0	2 (1.00)	0 (0.00)	1 (5.26)	1 (1.54)	0 (0.00)	0 (0.00)
1	40 (20.00)	7 (11.67)	3 (15.79)	19 (29.23)	6 (18.75)	5 (20.83)
2	109 (54.50)	27 (45.00)	9 (47.37)	35 (53.85)	22 (68.75)	16 (66.67)
3	35 (17.50)	18 (30.00)	6 (31.58)	6 (9.23)	2 (6.25)	3 (12.50)
4	14 (7.00)	8 (13.33)	0 (0.00)	4 (6.15)	2 (6.25)	0 (0.00)
CAR-T guidelines, n (%)							–	0.971*
0	23 (11.50)	5 (8.33)	4 (21.05)	9 (13.85)	2 (6.25)	3 (12.50)
1	68 (34.00)	22 (36.67)	6 (31.58)	20 (30.77)	13 (40.62)	7 (29.17)
2	58 (29.00)	16 (26.67)	5 (26.32)	19 (29.23)	9 (28.12)	9 (37.50)
3	25 (12.50)	8 (13.33)	2 (10.53)	9 (13.85)	4 (12.50)	2 (8.33)
4	12 (6.00)	4 (6.67)	2 (10.53)	2 (3.08)	2 (6.25)	2 (8.33)
5–8	14 (7.00)	5 (8.33)	0 (0.00)	6 (9.23)	2 (6.25)	1 (4.17)

#: Kruskal–Wallis test; χ²: chi-square test; –: Fisher’s exact; *: Simulated P-value.

M: median; Q₁: first quartile, Q₃: third quartile; CAR-T: chimeric antigen receptor T-cell; GQS: Global Quality Score.

Impact of content type on video quality and performance

Case report videos (n = 29) attracted more interaction than science popularization videos (n = 171). Specifically, the median number of comments was 88.00, significantly higher than 10.00 for science popularization videos (Z = −3.13, P = 0.002). The median number of likes was also higher (528.00 vs. 138.00), but this difference did not reach statistical significance (Z = −1.89, P = 0.058). There was no statistically significant difference in video duration, number of saves, or number of shares between the two categories of videos. Source distribution indicated that non-professionals accounted for the highest proportion (51.72%) of case-based videos, while science popularization videos were most often created by medical professionals (32.75%) and non-professionals (29.24%), and all videos posted by researchers fell into this category. Despite differences in interactivity, DISCERN scores, GQS scores, and CAR-T scores showed no statistically significant difference between the two video types (all P values > 0.05), suggesting that content type has a limited impact on quality assessment (Table 3).

Table 3.

Comparison of user engagement and information quality in CAR-T short videos across different content types (case reports vs. science popularization).

Variables	Total (n = 200)	Case report (n = 29)	Science popularization (n = 171)	Statistic	P
Year, M (Q₁, Q₃)	2024.00 (2022.00, 2025.00)	2024.00 (2023.00, 2025.00)	2023.00 (2022.00, 2025.00)	Z = −2.24	0.025
Comment, M (Q₁, Q₃)	14.50 (1.00, 91.25)	88.00 (12.00, 336.00)	10.00 (1.00, 62.50)	Z = −3.13	0.002
Like, M (Q₁, Q₃)	169.50 (25.00, 699.75)	528.00 (79.00, 1941.00)	138.00 (23.00, 636.00)	Z = −1.89	0.058
Collect, M (Q₁, Q₃)	95.00 (23.00, 352.75)	85.00 (18.00, 260.00)	95.00 (26.00, 361.50)	Z = −0.69	0.489
Share, M (Q₁, Q₃)	56.00 (10.75, 262.00)	93.00 (16.00, 206.00)	53.00 (10.00, 270.00)	Z = −0.15	0.877
Time, M (Q₁, Q₃)	166.16 (93.00, 442.63)	145.70 (87.70, 219.00)	177.00 (93.80, 531.00)	Z = −1.34	0.179
Platform, n (%)				χ² = 4.88	0.027
B	100 (50.00)	9 (31.03)	91 (53.22)
T	100 (50.00)	20 (68.97)	80 (46.78)
Source, n (%)				–	<0.001
Medical personnel	60 (30.00)	4 (13.79)	56 (32.75)
News media	19 (9.50)	7 (24.14)	12 (7.02)
Non-professional personnel	65 (32.50)	15 (51.72)	50 (29.24)
Research institute	32 (16.00)	3 (10.34)	29 (16.96)
Researcher	24 (12.00)	0 (0.00)	24 (14.04)
DISCERN, n (%)				χ² = 5.44	0.245
1	36 (18.00)	8 (27.59)	28 (16.37)
2	79 (39.50)	14 (48.28)	65 (38.01)
3	68 (34.00)	6 (20.69)	62 (36.26)
4	11 (5.50)	1 (3.45)	10 (5.85)
5	6 (3.00)	0 (0.00)	6 (3.51)
GQS, n (%)				–	0.711
0	2 (1.00)	0 (0.00)	2 (1.17)
1	40 (20.00)	6 (20.69)	34 (19.88)
2	109 (54.50)	19 (65.52)	90 (52.63)
3	35 (17.50)	3 (10.34)	32 (18.71)
4	14 (7.00)	1 (3.45)	13 (7.60)
CAR-T guidelines, n (%)				–	0.624
0	23 (11.50)	7 (24.14)	16 (9.36)
1	68 (34.00)	8 (27.59)	60 (35.09)
2	58 (29.00)	8 (27.59)	50 (29.24)
3	25 (12.50)	3 (10.34)	22 (12.87)
4	12 (6.00)	2 (6.90)	10 (5.85)
5–8	14 (7.00)	1 (3.45)	13 (7.60)

Z: Mann-Whitney test; χ²: chi-square test; –: Fisher’s exact.

M: median; Q₁: first quartile; Q₃: third quartile; CAR-T: chimeric antigen receptor T-cell; GQS: Global Quality Score.

Correlation analysis: Consistency among quality metrics and decoupling from engagement

On the Bilibili platform (Figure 2(a)), significant positive correlations exist among the three content quality scores: DISCERN and GQS (r = 0.77), DISCERN and CAR-T score (r = 0.67), and GQS and CAR-T score (r = 0.57). This indicates that higher structural scores are correlated with higher overall dissemination quality and content integrity scores. Additionally, user interaction metrics showed strong correlations, such as likes and bookmarks (r = 0.93), bookmarks and shares (r = 0.91), and likes and comments (r = 0.82). However, the correlation between content quality scores and engagement metrics was generally weak. For instance, DISCERN showed a low correlation with likes (r = 0.26), GQS with comments (r = 0.13), and CAR-T scores with shares (r = 0.19). This indicates that user engagement levels did not have a significant positive correlation with information quality.

On the TikTok platform (Figure 2(b)), DISCERN scores were also highly correlated with GQS scores (r = 0.77), while a moderate positive correlation was observed between DISCERN and CAR-T scores (r = 0.66). Correlations among user engagement metrics were even more pronounced: likes and bookmarks (r = 0.94), likes and comments (r = 0.89), and shares and bookmarks (r = 0.92) were all strongly correlated. However, as on Bilibili, the correlations between interaction metrics and quality scores on TikTok were generally weak. For instance, the CAR-T score showed no significant correlation with likes (r = –0.16), nor did the GQS score with comments (r = –0.12) or the DISCERN score with shares (r = 0.08). Overall, both platforms demonstrated a high degree of consistency among content quality metrics, while content quality was decoupled from platform engagement metrics, suggesting that high interaction does not necessarily equate to high quality.
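The coefficients reported above are Spearman rank correlations. As a minimal sketch of how such a coefficient is obtained, the code below ranks both variables (averaging tied ranks) and computes the Pearson correlation of the ranks. The function names and the example scores and like counts are hypothetical and are not taken from the study data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical quality scores and like counts for five videos.
quality = [1, 2, 3, 4, 5]
likes = [120, 900, 300, 5000, 2500]
print(round(spearman_rho(quality, likes), 2))  # 0.8
```

Because only ranks enter the calculation, a few videos with extremely high like counts do not dominate the coefficient, which suits heavily skewed engagement data.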

Discussion

In this study, 200 videos related to CAR-T therapy across two major Chinese short-video platforms (Bilibili and TikTok) were systematically analyzed. Content quality and interaction characteristics were evaluated across multiple dimensions, including structural integrity, scientific accuracy, and dissemination performance, and potential associations between platform features, publisher background, content type, and video quality were explored. The findings reveal that TikTok performs more strongly on user engagement metrics: its videos receive more likes, comments, saves, and shares. These differences between TikTok (Douyin) and Bilibili may partly reflect differences in the overall popularity of the two platforms. In general, TikTok has a broader user reach, higher day-to-day usage frequency, and greater overall activity; therefore, under the same retrieval conditions, relevant videos are more likely to obtain higher levels of engagement. By contrast, Bilibili differs in user scale and typical usage scenarios, so engagement may accumulate more slowly and to a lesser extent. However, TikTok videos scored significantly lower than Bilibili videos in structural evaluation (DISCERN), reflecting a distinct disparity between the two platforms in balancing “popularity” and “content quality.” At the same time, the three scoring tools demonstrated strong consistency among themselves, while showing generally weak correlations with engagement metrics. This suggests a disconnect between user behavior and the scientific rigor of the videos.

Although the TikTok platform has a higher user reach and recommendation efficiency,²⁰ this study indicates that its medical videos generally perform poorly in terms of structure, balance, and labeling of information sources. The TikTok platform emphasizes short duration, fast pacing, and emotional appeal, encouraging creators to capture user attention rapidly through visual impact and sensory stimulation.²¹ To some extent, this content ecosystem has lowered the demands for logical coherence and completeness in medical communication. As a highly complex precision immunotherapy, CAR-T therapy involves a multi-stage process including cell collection, genetic engineering, in vitro expansion, and clinical infusion.²² It requires detailed disclosure of its indications, evidence of efficacy, adverse reactions, and long-term follow-up management. Content lasting tens of seconds or even a minute is clearly insufficient to present this effectively and is highly likely to result in fragmented knowledge or even misleading statements. Notably, medical professionals posted a higher proportion of videos on TikTok, which would be expected to indicate greater content expertise, but their content did not score higher than that published by researchers or institutions. This may stem from healthcare practitioners’ tendency to communicate through case narratives and experience sharing. While such approaches align more closely with the platform's context and resonate easily with audiences, they often overlook the integrity of information structure and evidence chains. For instance, some videos focus on demonstrating efficacy or showing patient testimonials without accompanying explanations of risk management or indication limitations, potentially misleading audiences into believing that CAR-T is a “cure-all” therapy. This illustrates that a medical background does not necessarily confer effective science communication skills, particularly on fast-paced, algorithm-driven new media platforms.

In this study, it was also observed that video creators on the Bilibili platform tend to be research institutions and scientists. Their videos are significantly longer, convey information more comprehensively, and feature clearer content structures with stronger logical coherence. This may be related to the platform's audience composition and content ecosystem. Bilibili users tend to be younger overall and show greater interest in knowledge-based content, and the platform supports longer-duration video distribution. This encourages content creators to adopt formats such as lectures and PowerPoint presentations to systematically explain complex medical topics. Consequently, they achieve better scores on both the DISCERN and CAR-T assessments. Unlike TikTok's “instant-scroll, instant-watch” dissemination logic, Bilibili is better suited to hosting structured, highly expandable science popularization content. Additionally, in terms of content format, while case-based videos demonstrate higher user engagement metrics such as likes and comments, they do not show superior performance on DISCERN, GQS, or CAR-T scores. This suggests a tension between dissemination effectiveness and information quality. Case videos build empathy through “personal narratives + authentic footage,” readily triggering user emotions and interaction intent. However, they often overlook standardized treatment protocols and evidence bases. Given the complexity of CAR-T therapy and the risk of serious toxicities (e.g. CRS and ICANS), narratives that overemphasize benefits while omitting eligibility boundaries and risk information may contribute to unrealistic expectations and potentially encourage inappropriate care-seeking. Therefore, balanced risk communication is essential. In addition, CAR-T therapy is mainly applied in adult and older patient populations, and audience composition may influence engagement patterns on different platforms. Available national reports indicate that older adults constitute a growing segment of Chinese Internet users and increasingly use short-video applications, and such demographic trends may further amplify engagement on mass-market platforms.²³ In contrast, Bilibili's audience is relatively younger and more knowledge-oriented, which may favor structured explanations but not necessarily maximize interaction counts. Therefore, platform engagement differences should be interpreted in the context of differing user demographics and platform popularity.

Another noteworthy phenomenon is the high positive correlation among the three scoring tools. This suggests that they collectively reflect the overall scientific rigor of videos, whether through the structural dimension of DISCERN, the overall communication quality assessment of GQS, or the professional dimension of the custom CAR-T score. However, the lack of strong correlation between these scores and user behavior metrics suggests that users may be driven more by emotion than by information accuracy. This finding is consistent with prior research. For instance, Mueller et al.'s¹⁶ analysis of YouTube health videos indicates that audiences lack scientific discernment regarding video quality and are often drawn to content with high emotional impact and low structural coherence. This can ultimately contribute to the widespread dissemination of health misinformation. Meanwhile, the increasing prevalence of AI-generated content and commercially driven “engagement farming” behaviors on global video platforms may also distort observed engagement metrics. Because we were unable to distinguish authentic user interactions from potentially automated or incentivized engagement, the relationship between popularity indicators and educational value should be interpreted cautiously.

This study reveals a structural issue of “high popularity but low quality” in the dissemination of CAR-T therapy information on Chinese short-video platforms, which has important practical significance. At present, China places great emphasis on medical science communication, as highlighted in policy documents such as The Outline of the Healthy China 2030 Plan,²⁴ which calls for improving public health literacy and strengthening the supply and dissemination capacity of authoritative medical information. At the same time, recent regulatory initiatives have also emphasized strengthening governance of online medical science popularization, including credential verification for medical-content accounts and stricter oversight of misleading, commercialized, or fabricated medical information. These policy trends align with our findings and support the necessity of platform-level quality assurance mechanisms for high-risk medical topics. Against this background, we found that user interaction indicators on platforms (such as likes, comments, and shares) lack consistency with the scientific rigor and structural completeness of video content. The “popularity-first” tendency of algorithmic recommendation logic can easily obscure the risk of inaccurate or misleading content. To address this, we recommend that platforms establish systematic medical content quality evaluation mechanisms and explore approaches such as “health information quality labels,” “authoritative certification marks,” and “trustworthy content recommendations” to help users identify high-quality medical science information. At the same time, medical institutions, professional societies, and other authoritative entities should be encouraged to actively engage on these platforms, improving the accessibility of authoritative information through expert science communication and serial explanatory content. Medical professionals should also raise their communication awareness by transforming complex medical knowledge into content that is clearly structured and appropriately expressed, thus improving their ability to communicate with the public. We suggest that “science communication skills” be incorporated into continuing medical education and postgraduate training to improve overall communication literacy. It is worth emphasizing that the custom CAR-T scoring tool constructed in this study systematically integrates key dimensions such as treatment process, mechanisms, indications, and adverse reactions into the short-video quality evaluation framework for the first time. This tool has significant practical value and potential for extension. It could be applied to other frontier therapies and rare disease science communication in the future, providing methodological support for the policy-level promotion of standardized scientific communication.

To ensure the objectivity and scientific rigor of our findings, we must also acknowledge certain limitations in the design and execution of this study. First, this study used a cross-sectional content analysis, reflecting only the quality status of CAR-T-related short videos at a specific point in time prior to 2025. Therefore, it cannot capture dynamic trends in content quality over time. Given the high update frequency and rapid information flow on short-video platforms, future research could adopt a longitudinal design to monitor the evolution of content structures across different phases and assess the impact of algorithmic recommendations on the distribution of content quality. Second, although dual independent scoring with consistency checks was used in this study, content quality assessments remain inevitably subject to subjectivity. In particular, the GQS and CAR-T scoring dimensions leave room for interpretation and may be influenced by evaluators’ differing perceptions of content style and expression. Future research could incorporate AI-assisted scoring mechanisms, such as using large language models (LLMs) to perform structural semantic analysis on video captions or text, thus enhancing scoring standardization. Third, although the self-developed CAR-T therapy Evaluation Form was constructed based on key points emphasized in current guidelines and expert consensus to enhance the coverage and comparability of CAR-T-specific content, we have not yet conducted more systematic psychometric validation (e.g. construct validity, criterion validity, internal consistency, and test–retest reliability). Future studies could further validate and refine this instrument using larger sample sizes, additional raters, and multicenter settings. Fourth, this study focuses on video content features and has not yet integrated social dissemination factors such as comment section analysis, user sharing pathways, or account characteristics. These external variables may significantly influence the content's “impact” and “misleading nature,” carrying substantial implications for the overall health information ecosystem. Finally, only two major platforms, TikTok and Bilibili, were analyzed in this study; while representative, they do not cover other platforms in the Chinese Internet ecosystem, such as WeChat Video, Xiaohongshu, and Kuaishou. Differences exist across platforms in algorithmic mechanisms, user demographics, and content moderation systems, necessitating further comparative analysis of their information quality and dissemination mechanisms.
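As an illustration of one of the psychometric checks named above, internal consistency of a multi-item checklist is commonly summarized with Cronbach's alpha, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below is not part of the study's analysis; it assumes each checklist item is scored numerically per video (here 0 or 1, which is an assumption), and the function name and example scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score table.

    scores: list of rows, one row per video, one column per checklist item.
    """
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two videos and two items")
    k = len(scores[0])
    # Sample variance of each item (column) across videos.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the per-video total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical ratings: 5 videos scored on 4 checklist items (0 = absent, 1 = present).
    ratings = [
        [1, 1, 1, 0],
        [0, 0, 1, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 0, 1, 1],
    ]
    print("Cronbach's alpha:", round(cronbach_alpha(ratings), 2))
```

Alpha addresses only internal consistency; construct validity, criterion validity, and test–retest reliability would require separate analyses and additional data.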

Conclusion

This study reveals a structural discrepancy of “high engagement—low quality” in the dissemination of CAR-T therapy-related content on Chinese short-video platforms, emphasizing that user interaction metrics cannot accurately reflect the scientific rigor of medical information. While TikTok offers extensive reach, it falls short in information structure and content integrity. In contrast, Bilibili demonstrates advantages in supporting structured expression and improving information depth. Future medical communication must strike a balance between “logical expression” and “professional standards.” Platforms should also jointly intervene through algorithmic and institutional measures to co-create a higher-quality health communication ecosystem.

Supplemental Material

Supplemental material, sj-docx-1-dhj-10.1177_20552076261429670 for Does popularity reflect quality? A study on the reliability and quality of CAR-T videos on Chinese short-video platforms by Simeng Gao, Jingru Han, Yan Zhang, Jie Liao, Jiayi Yang, Yang Zhao, Linhao Xie, Min Su and Jianfu Zhao in DIGITAL HEALTH

Footnotes

Ethical approval and consent to participate

This study analyzed only publicly accessible video content and publicly visible engagement metrics on Bilibili and TikTok. We did not contact users, conduct any intervention, or collect/record/store any personally identifiable information (e.g. usernames, user IDs, profile links, and contact information). Any identifiers visible on the platforms were not extracted or reported. Results are presented only in aggregated form. Because the study used publicly available data without collecting identifiable private information, formal ethical approval and individual informed consent were not required. Data collection and use complied with the platforms’ terms of service and relevant ethical guidance for internet research.

Consent to publication

Not applicable.

Author contributions

Simeng Gao, Jingru Han, and Yan Zhang: methodology, formal analysis, and writing–original draft; Yang Zhao, Jiayi Yang, Min Su, Linhao Xie, and Jianfu Zhao: conceptualization, methodology, supervision, and writing–review and editing. All authors read and approved the final manuscript.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

All data supporting the findings of this study are included within the article and its files. In cases where specific original data are not provided, the corresponding author will provide them upon reasonable request.

Supplemental material

Supplemental material for this article is available online.

References

1. June, O’Connor, Kawalekar, et al. CAR T cell immunotherapy for human cancer. Science 2018; 359: 1361–1365.

2. Sterner. CAR-T cell therapy: current limitations and potential strategies. Blood Cancer J 2021; 11: 69.

3. Maude, Laetsch, Buechner, et al. Tisagenlecleucel in children and young adults with B-cell lymphoblastic leukemia. N Engl J Med 2018; 378: 439–448.

4. Neelapu, Locke, Bartlett, et al. Axicabtagene ciloleucel CAR T-cell therapy in refractory large B-cell lymphoma. N Engl J Med 2017; 377: 2531–2544.

5. Schuster, Svoboda, Chong, et al. Chimeric antigen receptor T cells in refractory B-cell lymphomas. N Engl J Med 2017; 377: 2545–2554.

6. Berdeja, Madduri, Usmani, et al. Ciltacabtagene autoleucel, a B-cell maturation antigen-directed chimeric antigen receptor T-cell therapy in patients with relapsed or refractory multiple myeloma (CARTITUDE-1): a phase 1b/2 open-label study. Lancet 2021; 398: 314–324.

7. Wang, Munoz, Goy, et al. KTE-X19 CAR T-cell therapy in relapsed or refractory mantle-cell lymphoma. N Engl J Med 2020; 382: 1331–1342.

8. Lee, Santomasso, Locke, et al. ASTCT consensus grading for cytokine release syndrome and neurologic toxicity associated with immune effector cells. Biol Blood Marrow Transplant 2019; 25: 625–638.

9. Hayden, Roddie, Bader, et al. Management of adults and children receiving CAR T-cell therapy: 2021 best practice recommendations of the European Society for Blood and Marrow Transplantation (EBMT) and the Joint Accreditation Committee of ISCT and EBMT (JACIE) and the European Haematology Association (EHA). Ann Oncol 2022; 33: 259–275.

10. Zhao, Lin, Song, et al. Universal CARs, universal T cells, and universal CAR T cells. J Hematol Oncol 2018; 11: 132.

11. Orlando, Han, Tribouley, et al. Genetic mechanisms of target antigen loss in CAR19 therapy of acute lymphoblastic leukemia. Nat Med 2018; 24: 1504–1506.

12. Grajales, Sheps, et al. Social media: a review and tutorial of applications in medicine and health care. J Med Internet Res 2014; 16: 13.

13. Wang, McKee, Torbica, et al. Systematic literature review on the spread of health-related misinformation on social media. Soc Sci Med 2019; 240: 112552.

14. Wang, Song, et al. The reliability and quality of short videos as a source of dietary guidance for inflammatory bowel disease: cross-sectional study. J Med Internet Res 2023; 25: e41518.

15. Niu, Hao, Yang, et al. Quality of pancreatic neuroendocrine tumor videos available on TikTok and Bilibili: content analysis. JMIR Form Res 2024; 8: e60033.

16. Mueller, Jungo, Cajacob, et al. The absence of evidence is evidence of non-sense: cross-sectional study on the quality of psoriasis-related videos on YouTube and their reception by health seekers. J Med Internet Res 2019; 21: e11935.

17. Charnock, Shepperd, Needham, et al. DISCERN: an instrument for judging the quality of written consumer health information on treatment choices. J Epidemiol Community Health 1999; 53: 105–111.

18. Bernard, Langille, Hughes, et al. A systematic review of patient inflammatory bowel disease information resources on the World Wide Web. Am J Gastroenterol 2007; 102: 2070–2077.

19. Liu, Chen, Lin, et al. YouTube/Bilibili/TikTok videos as sources of medical information on laryngeal carcinoma: cross-sectional content analysis study. BMC Public Health 2024; 24: 1594.

20. Zhou, Shen, Kong. A study of text classification algorithms for live-streaming e-commerce comments based on improved BERT model. PLoS One 2025; 20: e0316550.

21. Liu, Sun, Hong, et al. Reproduction, cultural symbolism, and online relationship: constructing city spatial imagery on TikTok. Front Psychol 2022; 13: 1080090.

22. Mouhssine, Maher, Gaidano. A STEP ahead for CAR-T cell therapy of large B cell lymphoma: understanding the molecular determinants of resistance. Transl Cancer Res 2023; 12: 2970–2975.

23. Tian. A study of the deconstruction and construction of self-efficacy in internet use among older people. BMC Geriatr 2025; 25: 55.

24. Liu, Yang, Cheng, et al. Mixed methods research on satisfaction with basic medical insurance for urban and rural residents in China. BMC Public Health 2020; 20: 1201.
