Abstract

Background

Despite being a prevalent peripheral vestibular disorder in China, Meniere's disease (MD) suffers from low awareness, frequent misdiagnosis, and unsatisfactory treatment rates. Although TikTok has become a prominent source of health information, no study has systematically evaluated the quality of its MD-related content. We therefore assessed the accuracy and reliability of MD videos on Chinese TikTok.

Methods

The top 100 videos retrieved for “Meniere's disease/syndrome” (TikTok, 1 May 2025) were analyzed. Quality was assessed using the Video Information and Quality Index (VIQI), Global Quality Score (GQS), modified DISCERN (mDISCERN), and Patient Education Materials Assessment Tool for Audiovisual Materials (PEMAT-A/V). Descriptive statistics, correlation analyses, and predictive modeling were applied to the 83 valid videos.

Results

Among 83 videos, 91.6% (n = 76) were physician-uploaded (primarily otolaryngologists/neurologists). Monologue, Q&A, and medical scenario formats showed superior quality. Symptoms dominated content (47%). Neurologists generated significantly higher normalized engagement per second than otolaryngologists for likes, favorites, and shares (all adj. p < 0.05, r > 0.34). Physicians outperformed news agencies in GQS scores (adj. p < 0.05, r = 0.291). Otolaryngologists scored higher than both neurologists and Traditional Chinese Medicine practitioners in PEMAT-A/V Understandability (all adj. p < 0.05, r > 0.37). Attending physicians exceeded chief physicians on VIQI, GQS, and PEMAT-A/V scores (all adj. p < 0.05, r > 0.35), an advantage potentially linked to their younger age, greater digital literacy, and more frequent social media use. Engagement metrics (likes, comments, favorites, shares) correlated strongly (r > 0.8). Predictive models for PEMAT-U/A were significant (p < 0.001), with no evidence of multicollinearity or autocorrelation.

Conclusion

Physician-created MD content ensures credibility but requires quality improvement. PEMAT-U/A models guide enhancements, though broader application needs validation. Key health informatics priorities include certified creator engagement, algorithm optimization, and innovative content design.

Keywords

Meniere's disease TikTok health information quality VIQI GQS mDISCERN PEMAT-A/V

Introduction

Meniere's disease (MD) is pathologically characterized by idiopathic endolymphatic hydrops. Its pathogenesis remains complex and incompletely understood.¹ Clinically, it manifests as recurrent spontaneous rotational vertigo, fluctuating and progressive sensorineural hearing loss, tinnitus, and/or aural fullness. The condition has a prevalence ranging from 50 to 200 cases per 100,000 adults and occurs most frequently in middle-aged women, with the majority of patients between 30 and 60 years of age.^1,2 Diagnosis relies heavily on clinical history in the absence of a gold-standard test, and while no definitive cure exists, current therapies primarily aim to control symptoms and improve quality of life.^2,3

Social media has emerged as a significant channel for individuals to access medical information, where many patients search for relevant content online both before and after consulting healthcare providers.⁴ TikTok, as a leading short-video sharing platform, offers a wide range of content, including numerous videos related to healthcare. In China, TikTok has up to 600 million active users, making it an important channel for disseminating health information.⁵ Although TikTok has great potential for spreading public health information, the quality of disease-related videos on the platform is inconsistent, and the accuracy and reliability of some information need further verification. Previous studies have focused on assessing the quality of TikTok videos covering various conditions, such as mitral valve regurgitation, gallstones, COVID-19, diabetes, chronic obstructive pulmonary disease, and atopic dermatitis.^6–11 However, no study has yet evaluated the quality of MD-related videos on TikTok. Therefore, we investigated MD-related videos on TikTok to identify their upload sources, content, and characteristics and to further assess video quality. This study aims to provide the public with evidence-based guidance for accessing reliable MD information online and to offer actionable recommendations for content creators and platforms.

Methods

Ethical considerations

No formal ethical approval was required for this study. The data comprised only publicly accessible TikTok videos that contained no personally identifiable information; accordingly, the project was classified as exempt from NHS Research Ethics Committee review following the UK Health Research Authority decision tool.¹²

Search strategy and video selection

This study was conducted and reported in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline for cross-sectional studies.¹³ On 1 May 2025, from 9:00 AM to 11:00 AM, a search was conducted on the Chinese version of TikTok using the terms “梅尼埃病” (Meniere's disease) and “梅尼埃综合征” (Meniere's syndrome). The search term “梅尼埃病” yielded only 19 videos, whereas hundreds of videos were retrieved using “梅尼埃综合征.” To minimize potential bias, the search was performed while logged out of any personal accounts and without applying any content filters. The analysis was restricted to the first 100 videos, as previous studies^14–16 have demonstrated that videos beyond this range do not significantly influence the analytical outcomes. After excluding duplicates and irrelevant content, a total of 83 TikTok videos were included in the study.

Video assessment process

The quality of the included videos was assessed using a structured process. All videos were collected and downloaded by one individual (XW). Two authors (XW and DLL) independently evaluated the videos using the assessment tools detailed below. Initial scores were determined through discussion. In cases of persistent disagreement, an arbitrator (ZYL) made the final decision. All authors subsequently approved each final rating. Agreement between the two raters (XW and DLL) was quantified with the intraclass correlation coefficient (ICC), calculated from a two-way fixed-effects model. ICC values range from 0 to 1 and were interpreted as follows: < 0.5 (poor agreement), 0.5 to 0.75 (moderate agreement), 0.75 to 0.90 (good agreement), and > 0.90 (excellent agreement).¹⁷
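The interpretation bands for the ICC can be expressed as a small lookup. The following is a minimal sketch rather than the authors' procedure; the function name is hypothetical, and assigning the shared boundary values (0.75 and 0.90) to the lower band is an assumption, since the text leaves them ambiguous.

```python
def interpret_icc(icc: float) -> str:
    """Map an ICC value to the agreement bands quoted in the text.

    Assumption: the shared boundaries 0.75 and 0.90 fall in the lower band.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    if icc < 0.5:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


if __name__ == "__main__":
    # Illustrative values only.
    for value in (0.42, 0.70, 0.78, 0.95):
        print(value, interpret_icc(value))
```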

Assessment tools and criteria

Video quality was assessed using four validated tools: the Video Information and Quality Index (VIQI)^16,18 for transmissibility; the modified DISCERN (mDISCERN) tool^19–22 for reliability; the Global Quality Score (GQS)^7,9,23,24 for overall information quality; and the Patient Education Materials Assessment Tool for Audiovisual Materials (PEMAT-A/V)^25,26 for public educational impact. These assessments were applied to all eligible videos.

Firstly, the VIQI was employed to evaluate video transmissibility. It encompasses four dimensions: information flow (VIQI 1), information accuracy (VIQI 2), quality (one point is awarded for each inclusion of an image, animation, interview, video subtitles, and summary in the video) (VIQI 3), and precision (the coherence between the video title and content) (VIQI 4). Each criterion is scored on a scale of 1 to 5, with higher scores indicating better quality.

Secondly, the mDISCERN tool was used to analyze video reliability and quality. This tool, validated for health videos on platforms like YouTube, includes five yes/no questions: (1) Is the video's objective clear and achieved? (2) Are reliable information sources used? (3) Is the information presented in a balanced and unbiased manner? (4) Are other information sources listed for patient reference? (5) Are areas of uncertainty mentioned? Each question is scored as 1 (indicating “yes”) or 0 (indicating “no”). Higher scores denote greater reliability.
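As an illustration of how the five yes/no items add up to a 0 to 5 score, a minimal sketch follows. The item keys are hypothetical shorthand for the five questions above and are not part of the instrument itself.

```python
# Hypothetical shorthand keys for the five mDISCERN questions, in order.
MDISCERN_ITEMS = (
    "clear_objective",
    "reliable_sources",
    "balanced",
    "additional_sources",
    "uncertainty_mentioned",
)


def mdiscern_score(answers: dict[str, bool]) -> int:
    """Sum the five yes/no answers: 1 point per 'yes', 0 per 'no'."""
    missing = [item for item in MDISCERN_ITEMS if item not in answers]
    if missing:
        raise ValueError(f"missing answers for: {missing}")
    return sum(1 for item in MDISCERN_ITEMS if answers[item])


if __name__ == "__main__":
    # Illustrative video: meets the first three criteria only.
    example = {
        "clear_objective": True,
        "reliable_sources": True,
        "balanced": True,
        "additional_sources": False,
        "uncertainty_mentioned": False,
    }
    print(mdiscern_score(example))
```

A video that answers "yes" to the first three questions only would therefore score 3 out of 5.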

Thirdly, the GQS was utilized to assess the quality of information in the videos. The GQS, widely recognized for evaluating the quality of health information on online video platforms, includes five criteria: (1) poor quality (poor information flow, most information missing, not useful for patients); (2) generally poor quality (poor flow, some information provided but many important topics missing, very limited use for patients); (3) moderate quality (deficient flow, some important information discussed adequately while others are not, somewhat useful for patients); (4) good quality (generally good flow, most relevant information listed, useful for patients despite some missing topics); and (5) excellent quality (excellent flow and highly useful information for patients). Higher scores indicate better video quality.

Lastly, the PEMAT-A/V was used to evaluate the educational impact of the video materials on the public. The PEMAT-A/V, designed specifically for assessing audiovisual materials, consists of 17 questions: 13 questions assess the understandability of the health information provided, while 4 questions assess the actionability of the recommendations. Responses are scored as “agree” (1), “disagree” (0), or “N/A.” Total scores, as well as subscores for understandability and actionability, are calculated using the formula “total score/total possible score × 100(%).” Higher scores indicate better understandability or actionability, or both.
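The percentage formula can be sketched as follows. This is an illustrative implementation, not the authors' scoring sheet: the function name is hypothetical, and it assumes that items rated "N/A" are excluded from the total possible score.

```python
from typing import Optional, Sequence


def pemat_score(ratings: Sequence[Optional[int]]) -> float:
    """Return a PEMAT percentage: points earned / points possible x 100.

    Each rating is 1 (agree), 0 (disagree), or None (N/A).
    Assumption: N/A items are dropped from the denominator.
    """
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    if any(r not in (0, 1) for r in applicable):
        raise ValueError("ratings must be 1, 0, or None")
    return 100.0 * sum(applicable) / len(applicable)


if __name__ == "__main__":
    # Illustrative ratings: 13 understandability items, two of them N/A.
    understandability = [1] * 8 + [0] * 3 + [None] * 2
    # Illustrative ratings: 4 actionability items.
    actionability = [1, 1, 0, 0]
    print(round(pemat_score(understandability), 1))
    print(pemat_score(actionability))
```

With these illustrative ratings, 8 of 11 applicable understandability items give about 72.7%, and 2 of 4 actionability items give 50%.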

Data extraction and video characterization

For each included video, the following characteristics were recorded and analyzed: title, uploader, uploader's identity and follower count, verification status and type, video length (seconds), upload date, engagement data (likes, comments, favorites, and shares) and their normalized values per second, and online persistence.
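Normalizing engagement by duration amounts to dividing each count by the video length in seconds. The sketch below shows one way to do this; the `Video` class and its field names are hypothetical, and the example record is built from illustrative numbers rather than from any actual video in the sample.

```python
from dataclasses import dataclass


@dataclass
class Video:
    """Hypothetical record holding the raw counts extracted for one video."""
    length_s: float
    likes: int
    comments: int
    favorites: int
    shares: int


def per_second(video: Video) -> dict[str, float]:
    """Divide each engagement count by the video length in seconds."""
    if video.length_s <= 0:
        raise ValueError("video length must be positive")
    metrics = ("likes", "comments", "favorites", "shares")
    return {name: getattr(video, name) / video.length_s for name in metrics}


if __name__ == "__main__":
    # Illustrative values only.
    example = Video(length_s=66, likes=517, comments=56, favorites=219, shares=240)
    for name, rate in per_second(example).items():
        print(f"{name}: {rate:.2f} per second")
```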

Uploaders were classified as physicians (otolaryngologists, neurologists, Traditional Chinese Medicine (TCM) practitioners, and other healthcare professionals), medical institutions, or news organizations. Content topics included epidemiology, etiology/prevention, symptoms, examination/diagnosis, and treatment. Videos not addressing these topics were excluded.

Non-original videos (reposts/translations/minimally edited) were identified. Presentation styles were classified into six formats (Solo Narration, Q&A, PPT/Class, Animation/Action, Medical Scenarios, TV show/Documentary) using a predefined coding scheme. Inter-rater reliability was assessed and showed substantial agreement (Cohen's κ = 0.828). Detailed definitions, examples, and the full coding scheme are provided in Supplemental material S1.

Data analysis

Statistical analyses were performed using IBM SPSS 27.0. The Shapiro–Wilk test was applied to assess the normality of continuous variables. Data are presented as mean ± standard deviation (SD) for normally distributed variables, median (range) for non-normally distributed variables, and counts (proportions) for categorical variables.

Given the non-normal distribution of the data, non-parametric tests were employed for group comparisons. The Kruskal–Wallis test was used for multigroup comparisons, with the effect size reported as η²H.²⁷ If significant, post hoc pairwise comparisons were conducted using Dunn's test with Bonferroni correction, with the effect size calculated as r.²⁸ The Mann–Whitney U test was used for two-group comparisons, with r likewise reported as the effect size. Effect sizes were interpreted according to Cohen's criteria: for η²H, 0.01 (small), 0.06 (medium), and 0.14 (large); for r, 0.1 (small), 0.3 (medium), and 0.5 (large).^27,28
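The effect sizes named above are commonly computed as η²H = (H − k + 1)/(n − k) for the Kruskal–Wallis statistic H with k groups and n observations, and r = |Z|/√N for a rank-based Z statistic. The sketch below assumes these standard formulas, which the text does not spell out, and a plain Bonferroni adjustment (p multiplied by the number of comparisons, capped at 1). All function names are hypothetical, and whether N in r refers to the pair or the full sample is not stated in the text.

```python
import math


def eta_squared_h(h: float, k: int, n: int) -> float:
    """Eta-squared based on the Kruskal-Wallis H statistic (can be slightly negative)."""
    if n <= k:
        raise ValueError("n must exceed the number of groups")
    return (h - k + 1) / (n - k)


def r_from_z(z: float, n: int) -> float:
    """Effect size r from a Z statistic and the number of observations compared."""
    if n <= 0:
        raise ValueError("n must be positive")
    return abs(z) / math.sqrt(n)


def bonferroni(p: float, comparisons: int) -> float:
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * comparisons)


def label(value: float, small: float, medium: float, large: float) -> str:
    """Classify an effect size against small/medium/large cut-offs."""
    if value >= large:
        return "large"
    if value >= medium:
        return "medium"
    if value >= small:
        return "small"
    return "negligible"


if __name__ == "__main__":
    # Illustrative inputs: H = 9.02 across 3 groups and 83 observations.
    eta = eta_squared_h(9.02, k=3, n=83)
    print(round(eta, 3), label(eta, 0.01, 0.06, 0.14))
    # Illustrative inputs: Z = 2.5 for a pairwise comparison of 64 observations.
    r = r_from_z(2.5, 64)
    print(r, label(r, 0.1, 0.3, 0.5))
    print(bonferroni(0.01, comparisons=3))
```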

Correlations were assessed using Spearman's rank correlation coefficient. Stepwise regression (bidirectional; entry p < 0.05, removal p > 0.10) was used to identify predictors of PEMAT-A/V Understandability (PEMAT-U) and PEMAT-A/V Actionability (PEMAT-A) scores. Multicollinearity was acceptable (variance inflation factors (VIF) < 5). Statistical significance was set at p < 0.05 (two-tailed).
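The VIF criterion follows from the standard definition VIF = 1/(1 − R²), where R² comes from regressing one predictor on the others, and tolerance is its reciprocal (1 − R²). A minimal sketch of that relationship is given below; the function names are hypothetical, and the threshold of 5 is the one quoted in the text.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """VIF for a predictor, given R^2 from regressing it on the other predictors."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def vif_from_tolerance(tolerance: float) -> float:
    """VIF is the reciprocal of tolerance (tolerance = 1 - R^2)."""
    if not 0.0 < tolerance <= 1.0:
        raise ValueError("tolerance must lie in (0, 1]")
    return 1.0 / tolerance


def is_acceptable(vif: float, threshold: float = 5.0) -> bool:
    """Apply the VIF < 5 rule used in the text."""
    return vif < threshold


if __name__ == "__main__":
    # Illustrative tolerance value.
    vif = vif_from_tolerance(0.985)
    print(round(vif, 3), is_acceptable(vif))
```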

Figures and charts were created using OriginPro2024 software (OriginLab Corporation, Northampton, MA, USA).

Results

A total of 83 TikTok videos on MD were included (Figure 1). These videos garnered substantial public engagement, with a cumulative 308,138 interactions. The median video duration was 66 seconds (range 6–1083), and the median online persistence was 434 days (Table 1).

Figure 1.

Search strategy and video screening procedure.

Table 1.

General characteristics of the videos.

Characteristics	Mean (SD)	Median (range)
Video length (seconds)	98.99 (145.30)	66 (6–1083)
Duration on TikTok (days)	487.87 (369.32)	434 (55–1338)
Thumbs up	1539.48 (3224.98)	517 (13–22000)
Comments	148.87 (246.54)	56 (0–1472)
Favorites	844.75 (1947.73)	219 (1–12000)
Sharing (counts)	1185.43 (2422.06)	240 (2–14000)
VIQI score	11.16 (0.74)	11 (8–15)
VIQI1	3.06 (0.77)	3 (2–5)
VIQI2	3.25 (0.73)	3 (1–5)
VIQI3	1.43 (1.43)	1 (1–4)
VIQI4	3.40 (3.40)	3 (2–5)
GQS score	3.43 (0.74)	3 (2–5)
mDISCERN	2.99 (0.59)	3 (2–5)
mDISCERN-1	0.88 (0.33)	1 (0–1)
mDISCERN-2	0.98 (0.15)	1 (0–1)
mDISCERN-3	0.99 (0.11)	1 (0–1)
mDISCERN-4	0.07 (0.26)	0 (0–1)
mDISCERN-5	0 (0)	0 (0)
PEMAT-U (%)	73.73 (16.08)	78 (25–100)
PEMAT-A (%)	51 (20.09)	50 (0–100)

GQS: Global Quality Score; mDISCERN: modified DISCERN; PEMAT: Patient Education Materials Assessment Tool; VIQI: Video Information and Quality Index.

Healthcare professionals dominated content creation (91.6%, 76/83 videos), whereas medical institutions contributed 6.0% (5/83) and news media 2.4% (2/83) (Figure 2(a)). Among physician-uploaded videos, otolaryngologists represented the primary specialty (61.8%, 47/76), followed by neurologists (22.4%, 17/76) and TCM practitioners (11.8%, 9/76), with senior clinicians (chief/associate chief physicians) producing 73.7% (56/76) of physician-led content (Figure 2(b)). Analysis of presentation formats revealed limited diversity, with solo narration accounting for 55.4% (46/83) of videos and medical scenarios comprising 18.1% (15/83), while participatory formats were notably absent (Figure 2(c)).

Figure 2.

Characteristics of TikTok videos related to Meniere's disease. (a) Distribution of video sources and physician uploaders. (b) Professional titles of physicians appearing in videos. (c) Video presentation styles.

General data comparisons

Medical specialty emerged as the most significant factor influencing viewer engagement. Specifically, neurologists generated significantly higher normalized engagement per second than otolaryngologists. This difference was statistically significant for likes (adj. p = 0.035, r = 0.345), favorites (adj. p = 0.015, r = 0.379), and shares (adj. p = 0.021, r = 0.366). A nonsignificant trend favoring TCM practitioners over otolaryngologists was also observed.

For professional title, attending physicians showed a nonsignificant tendency toward more favorites per second than chief physicians. Video duration, however, differed significantly: videos by attending physicians were longer than those by associate chief physicians (adj. p = 0.002, r = 0.516). News agency videos, despite high raw engagement, showed no significant effect, likely because of the small sample size (n = 2).

In terms of presentation format, video format showed no overall effect on normalized engagement rates. Nevertheless, the higher absolute engagement seen in monologue and Q&A formats (Supplemental Table 1) corresponded to their significantly longer duration compared to PPT/Class-style videos (all adj. p < 0.05, r > 0.38).

Video categorization and assessment of video quality and reliability

TikTok videos exhibited high originality (97.6%), with 63.9% focusing on single topics. Symptoms constituted the most prevalent subject (47%), predominantly presented by senior clinicians (chief/associate chief physicians), while examination/diagnosis ranked second (34.9%), frequently addressing differential diagnosis of MD versus vestibular migraine or benign paroxysmal positional vertigo (BPPV) (Table 2). Quality assessment yielded moderate scores (Table 1): VIQI (mean 11.16, range 8–15), GQS (mean 3.43, range 2–5), and mDISCERN (mean 2.99, range 2–5). PEMAT analysis demonstrated higher understandability (PEMAT-U: mean 73.73%) than actionability (PEMAT-A: mean 51.00%), indicating clear information delivery but insufficient actionable guidance. Inter-rater reliability was robust across instruments (ICC range: 0.780–0.946).

Table 2.

Categorization and scores of the videos.

	N (83)	Rate (%)
Originality	81	97.6
Number of topics per video
1	53	63.9
2	27	32.5
3	3	3.6
Type of topics
Epidemiology	2	2.4
Etiology/prevention	21	25.3
Symptoms	39	47
Examinations/diagnosis	29	34.9
Treatment	25	30.1
Scale, score
mDISCERN
1	0	0
2	12	14.5
3	63	75.9
4	5	6.0
5	3	3.6
Global Quality Score
1	0	0
2	6	7.2
3	41	49.4
4	30	36.1
5	6	7.2
VIQI
0–5	0	0
6–10	38	45.8
11–15	45	54.2
16–20	0	0
PEMAT-U Score
(range 0–25)	1	1.2
(range 26–50)	8	9.6
(range 51–75)	29	34.9
(range 76–100)	45	54.2
PEMAT-A Score
(range 0–33.33)	35	42.2
(range 33.34–66.67)	35	42.2
(range 66.68–100)	13	15.6

mDISCERN: modified DISCERN; PEMAT: Patient Education Materials Assessment Tool; VIQI: Video Information and Quality Index.

Subgroup analyses revealed significant variations in video quality across multiple dimensions (Table 3 and Figures 3–5). Physicians achieved higher GQS scores than news organizations (adj. p = 0.030, r = 0.291). Among physician specialties, otolaryngologists attained superior GQS relative to TCM practitioners (adj. p = 0.027, r = 0.379) and demonstrated significantly higher PEMAT-U scores than both neurologists (adj. p = 0.017, r = 0.374) and TCM practitioners (adj. p = 0.007, r = 0.433). Regarding professional rank, attending physicians outperformed both chief physicians and associate chief physicians across all core quality metrics. Specifically, relative to chief physicians, attending physicians had higher scores in VIQI (adj. p = 0.010, r = 0.409), GQS (adj. p = 0.015, r = 0.390), PEMAT-U (adj. p = 0.003, r = 0.458), and PEMAT-A (adj. p = 0.017, r = 0.384). The comparisons with associate chief physicians showed similar superior performance for attending physicians. Presentation format also significantly influenced quality: PPT/Class formats had lower VIQI than monologue and medical scenarios (adj. p = 0.002 and 0.017, respectively; both r > 0.35), while Q&A formats achieved higher mDISCERN than PPT/Class (adj. p = 0.034, r = 0.741).

Figure 3.

Evaluation of video quality according to uploaders. (a) Comparison of VIQI, GQS, and mDISCERN scores among different uploaders. (b) Comparison of PEMAT-U and PEMAT-A scores among different uploaders (the error bars represent the minimum to maximum values). GQS: Global Quality Score; mDISCERN: modified DISCERN; PEMAT: Patient Education Materials Assessment Tool; VIQI: Video Information and Quality Index.

Figure 4.

Assessment of video quality based on medical discipline. (a) Comparison of VIQI, GQS, and mDISCERN scores among different doctors and other healthcare professionals. (b) Comparison of PEMAT-U and PEMAT-A scores among different doctors and other healthcare professionals (the error bars represent the minimum to maximum values). GQS: Global Quality Score; mDISCERN: modified DISCERN; PEMAT: Patient Education Materials Assessment Tool; VIQI: Video Information and Quality Index.

Figure 5.

Assessment of video quality based on professional titles. (a) Comparison of VIQI, GQS, and mDISCERN scores among different professional titles. (b) Comparison of PEMAT-U and PEMAT-A scores among different professional titles (the error bars represent the minimum to maximum values). GQS: Global Quality Score; mDISCERN: modified DISCERN; PEMAT: Patient Education Materials Assessment Tool; VIQI: Video Information and Quality Index.

Table 3.

Quality assessment of videos based on uploaders and presentation format.

Characteristic	n	VIQI	GQS	mDISCERN	PEMAT-U (%)	PEMAT-A (%)
Uploaders	83
Doctors	76	11 (8, 15)	3 (2, 5)	3 (2, 5)	79 (25, 100)	50 (0, 100)
Medical organization	5	10 (10, 12)	3 (3, 3)	3 (3, 3)	67 (67, 85)	33 (33, 67)
News agency	2	8.5 (8, 9)	2 (2, 2)	2.5 (2, 3)	72.5 (67, 78)	46 (25, 67)
p-Value		0.069	0.011	0.405	0.637	0.356
η²H		0.042	0.089	−0.002	−0.026	0.009
Doctors	76
Otolaryngologist	47	11 (8, 15)	4 (2, 5)	3 (2, 5)	80 (25, 100)	67 (0, 100)
Neurologist	17	10 (9, 14)	3 (2, 4)	3 (2, 5)	70 (50, 83)	33 (33, 67)
TCM practitioner	9	11 (9, 13)	3 (2, 4)	3 (2, 3)	67 (33, 80)	33 (25, 75)
Other healthcare professionals	3	13 (9, 13)	3 (2, 4)	3 (2, 3)	78 (50, 85)	50 (25, 75)
p-Value		0.730	0.006	0.092	0.001	0.047
η²H		−0.02	0.13	0.05	0.18	0.07
Professional titles
Chief physician	32	10 (9, 14)	3 (3, 4)	3 (2, 5)	71.5 (25, 100)	50 (0, 100)
Associate chief physician	24	11 (8, 14)	3 (2, 4)	3 (2, 5)	75.5 (38, 89)	50 (25, 75)
Attending physician	20	13 (9, 15)	4 (2, 5)	3 (2, 5)	84 (33, 100)	67 (25, 100)
p-Value		0.007	0.012	0.343	0.003	0.006
η²H		0.11	0.09	< 0.01	0.13	0.11
Presentation format	83
Monologue	46	12 (9, 14)	4 (2, 5)	3 (2, 5)	79 (38, 100)	50 (25, 85)
Question & Answer	3	12 (9, 13)	3 (3, 4)	4 (3, 4)	85 (78, 100)	67 (50, 100)
PPT or Class	13	10 (8, 12)	3 (2, 4)	3 (2, 3)	67 (25, 85)	33 (25, 75)
Animation/action	3	11 (10, 12)	3 (3, 4)	3 (2, 3)	80 (68, 80)	50 (33, 67)
Medical scenarios	15	12 (9, 15)	3 (3, 5)	3 (2, 4)	80 (38, 100)	50 (0, 100)
TV show/documentary	3	9 (8, 11)	2 (2, 4)	3 (2, 3)	78 (67, 82)	33 (25, 67)
p-Value		0.002	0.107	0.036	0.036	0.312
η²H		0.181	0.081	0.081	0.053	0.001

GQS: Global Quality Score; mDISCERN: modified DISCERN; PEMAT: Patient Education Materials Assessment Tool; TCM: Traditional Chinese Medicine; VIQI: Video Information and Quality Index.

Correlation and stepwise regression analysis

Spearman correlation analysis was conducted to evaluate relationships among video characteristics, accounting for categorical variables and non-normally distributed data. As shown in the correlation heatmap (Figure 6), strong correlations (r > 0.8) were observed among the engagement metrics (likes, comments, favorites, and shares). Their correlations with VIQI were moderate (r = 0.4–0.7), and were even weaker with video length (r = 0.24–0.50). Significant negative correlations were observed between physician title and both engagement metrics and quality parameters, specifically PEMAT-U scores (r = −0.29, p < 0.01), PEMAT-A scores (r = −0.31, p < 0.05), and GQS scores (r = −0.25, p < 0.05).
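For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a plain-Python illustration of that definition, not the SPSS routine used in the study; the engagement counts in the example are hypothetical.

```python
import math
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical likes and shares for five videos.
    likes = [10, 200, 35, 1500, 80]
    shares = [3, 90, 20, 700, 15]
    print(round(spearman(likes, shares), 3))
```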

Figure 6.

Correlations between all video parameters and quality assessment scores.

PEMAT-U scores correlated significantly with Professional title, Doctors, and Length, while PEMAT-A scores correlated with Professional title, Doctors, Length, and Number of topics. Stepwise regression analysis produced two predictive models (Table 4): PEMAT-U = 96.389 − 5.711 × Professional titles − 8.118 × Doctors + 0.025 × Length; PEMAT-A = 63.469 − 6.474 × Professional titles − 8.506 × Doctors + 0.037 × Length + 8.298 × Number of topics.
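To show how such a linear model is applied, the sketch below evaluates the PEMAT-U equation using the unstandardized coefficients, with the constant taken as reported in Table 5 (96.39). The numeric coding of the categorical predictors (Professional titles, Doctors) is not stated in the text, so the codes used here are placeholders chosen purely for illustration, and the helper names are hypothetical.

```python
# PEMAT-U coefficients; the constant follows Table 5.
PEMAT_U_MODEL = {
    "constant": 96.389,
    "professional_title": -5.711,
    "doctors": -8.118,
    "length_s": 0.025,
}


def predict(model: dict[str, float], features: dict[str, float]) -> float:
    """Evaluate constant + sum(coefficient * feature) for a linear model."""
    total = model["constant"]
    for name, coefficient in model.items():
        if name == "constant":
            continue
        total += coefficient * features[name]
    return total


if __name__ == "__main__":
    # Placeholder codes: the study does not report how categories were numbered.
    video = {"professional_title": 1, "doctors": 1, "length_s": 66}
    print(round(predict(PEMAT_U_MODEL, video), 2))
    # Holding the other predictors fixed, 60 extra seconds adds 0.025 * 60 points.
    longer = dict(video, length_s=126)
    print(round(predict(PEMAT_U_MODEL, longer) - predict(PEMAT_U_MODEL, video), 2))
```

Under these placeholder codes, the length coefficient implies a gain of about 1.5 percentage points in predicted PEMAT-U per additional minute of video.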

Table 4.

Stepwise regression analysis summary.

Dependent variable	Predictors included	R²	Adjusted R²	Standard error of the estimate	F change	p-Value (F change)	Durbin–Watson	ANOVA F	ANOVA p
PEMAT-U	Constant, Professional titles, Doctors, Length	0.265	0.235	14.586	4.34	0.041	1.52	8.670	< 0.001
PEMAT-A	Constant, Professional titles, Doctors, Length, Number of topics	0.271	0.23	17.715	5.341	0.024	1.89	6.588	< 0.001

ANOVA: analysis of variance; PEMAT: Patient Education Materials Assessment Tool.

The Durbin–Watson statistics (1.52 and 1.89) indicated no significant residual autocorrelation. Collinearity diagnostics revealed no multicollinearity concerns, with tolerance values exceeding 0.8 and VIF below 2.0 for all predictors (Table 5), supporting model validity.
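The Durbin–Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals; values near 2 suggest little first-order autocorrelation, while values toward 0 or 4 suggest positive or negative autocorrelation. The sketch below implements this standard definition on hypothetical residuals, since the study's residuals are not available.

```python
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic: sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("residuals must not all be zero")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical residuals that alternate in sign (negative autocorrelation).
    print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
    # Hypothetical residuals that never change (extreme positive autocorrelation).
    print(durbin_watson([1.0, 1.0, 1.0, 1.0]))
```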

Table 5.

Stepwise regression coefficients, statistical significance, and collinearity assessment.

Dependent variable	Predictors	Unstandardized coefficient B (SE)	Standardized coefficient β	t-Value	p-Value	95.0% CI for B, lower bound	95.0% CI for B, upper bound	Tolerance	VIF
PEMAT-U	(Constant)	96.39 (5.64)		17.09	< 0.001	85.147	107.631
	Professional titles	−5.71 (2.08)	−0.28	−2.75	0.008	−9.85	−1.573	0.985	1.015
	Doctors	−8.12 (2.08)	−0.415	−3.91	< 0.001	−12.261	−3.975	0.903	1.107
	Length	0.025 (0.01)	0.222	2.08	0.041	0.001	0.048	0.896	1.116
PEMAT-A	(Constant)	63.47 (8.19)		7.75	< 0.001	47.131	79.808
	Professional titles	−6.47 (2.52)	−0.262	−2.57	0.012	−11.507	−1.442	0.983	1.017
	Doctors	−8.51 (2.54)	−0.359	−3.35	0.001	−13.565	−3.448	0.894	1.118
	Length	0.040 (0.01)	0.277	2.59	0.012	0.009	0.066	0.896	1.116
	Number of topics	8.30 (3.59)	0.236	2.311	0.024	1.138	15.458	0.987	1.014

B: unstandardized regression coefficient; β: standardized regression coefficient; CI: confidence interval; PEMAT: Patient Education Materials Assessment Tool.

Discussion

The utilization of social media in public health education has increased substantially, with digital videos becoming a significant medium for patient education over recent decades.²⁹ Within otolaryngology-head and neck surgery, multiple studies have evaluated the educational value of YouTube and TikTok videos addressing conditions including cholesteatoma,³⁰ pediatric tonsillectomy,³¹ tympanostomy tubes,³² rhinoplasty,³³ tinnitus,³⁴ nasopharyngeal carcinoma,³⁵ and laryngeal cancer.³⁶ However, a critical evidence gap persists regarding social media content for MD, with no prior evaluations identified in this domain.

Key findings

This cross-sectional study demonstrates that most TikTok videos about MD originate from certified physicians, predominantly otolaryngologists or neurologists, with all physician identities verified by TikTok's authentication system to ensure professionalism and scientific credibility. Our comprehensive assessment using four validated quality evaluation tools revealed substantial opportunities for improvement in MD-related video quality. The analysis incorporated novel determinants frequently overlooked in existing literature—including content originality, creator certification status, video presentation format, physician specialty, and professional title hierarchy. These findings provide actionable guidance for healthcare consumers seeking reliable health information while offering evidence-based insights for content creators, platform designers, and public health practitioners developing digital health communication strategies.

Analysis of video characteristics and viewer engagement

Analysis of 83 TikTok videos revealed substantial public demand, evidenced by 308,138 cumulative interactions. The median video duration of 66 seconds aligns with platform attention patterns, while median online persistence of 434 days confirms long-term health information accessibility—critical for sustained public health communication.⁶

Healthcare professionals produced 91.6% of the content, a finding consistent with other studies in terms of the overall proportion.³⁷ The content was primarily created by otolaryngologists (61.8%) and neurologists (22.4%), among whom senior clinicians (chief/associate chief physicians) created the majority of physician-led videos (73.7%). This distribution highlights specialized expertise requirements but also reveals participation disparities across professional ranks. Content diversity was limited, marked by the dominance of solo narration (55.4%) and the complete absence of participatory formats, which reduces opportunities for interactive health education.

Medical specialty emerged as the most significant factor influencing viewer engagement per second in our analysis. This study represents the first in this field to normalize engagement rates by video duration, thereby controlling for a key confounding variable. Neurologists generated consistently higher engagement than otolaryngologists, with significant effects for likes, favorites, and shares (all adj. p < 0.05, r > 0.3). This disparity likely stems from differing content approaches and public health interests. Neurologists frequently address high-burden, prevalent conditions such as migraines, stroke, and cognitive disorders, which attract broad public interest. Public perception also plays a role, as disorders presenting with vertigo (e.g. MD) are often mistakenly attributed to neurological causes, further directing audience attention to neurologists. In contrast, otolaryngology content more commonly featured clinical consultations or procedural videos, which may hold narrower appeal for general health education seekers.

Beyond the primary specialty-based findings, other factors showed limited or nonsignificant effects. Regarding professional title, a significant difference was observed in video duration, where attending physicians produced longer videos than associate chief physicians (adj. p = 0.002, r = 0.52). This aligns with their differing content strategies. Attending physicians primarily developed purpose-built educational content, whereas senior clinicians predominantly uploaded videos derived from clinical consultations. Similarly, news agencies produced videos with markedly higher raw engagement metrics, but these comparisons did not reach statistical significance, likely due to the very small subgroup size (n = 2).

Analysis of presentation format showed that formats such as monologue and Q&A were associated with longer videos and higher absolute engagement (Supplemental Table 1). However, this did not result in a statistically significant effect on normalized engagement per second (Table 6), indicating that the observed benefit is mediated primarily by increased video duration.

Table 6.

Normalized engagement metrics of videos based on uploaders and presentation format.

Characteristic	n	Length (seconds)	Thumbs up/Length (seconds)	Comments/Length (seconds)	Favorites/Length (seconds)	Sharing/Length (seconds)
Uploaders	83
Doctors	76	74 (6, 1083)	7.90 (0.28, 291.67)	0.76 (0, 15.78)	3.44 (0.02, 208.33)	3.27 (0.06, 250)
Medical organizations	5	37 (33, 55)	10.14 (4.73, 85.95)	1.03 (0.57, 18.14)	6.29 (2.35, 47.05)	4.71 (1.55, 146.51)
News agencies	2	37 (33, 55)	466.38 (35.77, 897)	32.64 (5.07, 60.22)	22.66 (4.53, 40.78)	33.57 (16.47, 50.67)
p-Value		0.121	0.098	0.090	0.326	0.230
η²H		0.028	0.033	0.035	0.003	0.012
Doctors	76
Otolaryngologist	47	76 (8, 500)	4.80 (0.28, 291.67)	0.60 (0, 7.46)	2.30 (0.02, 208.33)	2.13 (0.06, 250)
Neurologist	17	64 (10, 144)	22.44 (1.78, 87.50)	2.34 (0.03, 15.39)	7.85 (0.76, 62.30)	15.79 (0.42, 86.08)
TCM practitioner	9	76 (6, 374)	15.78 (1, 61.28)	3.49 (0.05, 15.78)	8.79 (0.44, 39.44)	15 (0.24, 51.18)
Other health care professionals	3	267 (55, 1083)	1.34 (0.53, 7.22)	0.18 (0.05, 1.22)	1.11 (0.40, 2.31)	1.55 (0.18, 5.84)
p-Value		0.303	0.006	0.017	0.005	0.006
η²H		0.009	0.131	0.100	0.136	0.133
Professional titles	76
Chief physician	32	76.5 (6, 1083)	3.99 (0.28, 291.67)	0.44 (0.00, 7.46)	1.94 (0.02, 208.33)	2.13 (0.06, 250.00)
Associate chief physician	24	41 (9, 359)	12.52 (0.38, 87.50)	1.06 (0.03, 15.78)	5.53 (0.18, 62.30)	5.25 (0.06, 86.08)
Attending physician	20	87.5 (12, 500)	14.30 (0.88, 87.75)	1.15 (0.05, 7.68)	7.28 (0.46, 39.44)	5.84 (0.24, 64.34)
p-Value		0.003	0.074	0.150	0.049	0.111
η²H		0.134	0.044	0.025	0.055	0.033
Presentation format	83
Solo narration	46	78.5 (12, 1083)	8.59 (0.41, 291.67)	0.62 (0.03, 15.39)	4.77 (0.08, 208.33)	4.69 (0.09, 250.00)
Question & Answer	3	127 (90, 259)	4.95 (0.28, 14.70)	0.27 (0.00, 1.80)	4.33 (0.09, 7.92)	3.37 (0.14, 6.01)
PPT or Class	13	12 (6, 374)	7.50 (1.00, 87.50)	1.58 (0.05, 15.78)	2.00 (0.36, 62.30)	2.50 (0.24, 49.00)
Animation / action	3	12 (10, 142)	4.80 (0.88, 59.00)	0.30 (0.07, 5.50)	2.30 (0.55, 18.25)	2.10 (1.38, 28.67)
Medical scenarios	15	55 (18, 152)	15.70 (0.45, 87.75)	2.32 (0.05, 18.14)	3.19 (0.02, 47.05)	2.53 (0.06, 146.51)
TV show / documentary	3	30 (9, 34)	35.77 (0.38, 897.00)	5.07 (0.03, 60.22)	4.53 (0.18, 40.78)	16.47 (0.06, 50.67)
p-Value		< 0.001	0.726	0.380	0.965	0.900
η²H		0.332	−0.028	0.004	−0.052	−0.044

TCM: Traditional Chinese Medicine.
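The negative η²H values in the table (for example, −0.028 for presentation format) can look like errors, but they follow from how this effect size is commonly defined. The sketch below assumes the widely used formula η²H = (H − k + 1)/(n − k), where H is the Kruskal-Wallis statistic, k the number of groups, and n the number of videos. The article does not state which formula was used, and the H values in the example are illustrative back-calculations, not figures reported by the authors.

```python
def eta_squared_h(h: float, k: int, n: int) -> float:
    """Eta-squared based on the Kruskal-Wallis H statistic.

    Assumed formula (not stated in the article): (H - k + 1) / (n - k).
    """
    if n <= k:
        raise ValueError("n must exceed the number of groups k")
    return (h - k + 1) / (n - k)


# Illustrative H values, back-calculated for this example only.
# Four physician specialties, n = 76 videos.
specialty = eta_squared_h(h=12.43, k=4, n=76)
# Six presentation formats, n = 83 videos; here H is smaller than k - 1.
presentation = eta_squared_h(h=2.83, k=6, n=83)

print(round(specialty, 3))     # 0.131
print(round(presentation, 3))  # -0.028
```

Under this definition the estimate turns negative whenever H falls below k − 1, its approximate expected value when the groups do not differ; such values are usually read as an effect size of zero.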

Collectively, these findings underscore several key imperatives for public health communication on short-video platforms. These include optimizing content duration for platform attention patterns, diversifying formats to enhance engagement, and supporting clinicians in developing dedicated educational materials rather than repurposing clinical encounters. In this context, the currently low participation rates from medical institutions (6.0%) and news media (2.4%) highlight valuable yet underutilized channels. Effectively leveraging these channels could significantly scale the dissemination of quality health information.

Video categorization and quality assessment

Tool selection remains critical for evaluating health communication quality. While PEMAT, VIQI, GQS, and mDISCERN are widely employed,^25,38–40 their application to short-form video platforms has limitations. This is primarily due to a mismatch between their criteria for comprehensive information and the concise, narrative-driven nature of short-form media, which may undervalue videos optimized for platform engagement. Correspondingly, TikTok videos demonstrated accessibility advantages but contained constrained information density, resulting in suboptimal quality scores.

VIQI assessment revealed that TikTok's high user engagement strengthened performance on VIQI-1 metrics. However, monotonous presentation styles, which relied heavily on verbal narration and offered few visual enhancements, reduced VIQI-3 scores. Videos by attending physicians, which featured more diverse presentation methods, achieved significantly higher VIQI scores than those of their senior colleagues.

mDISCERN evaluation indicated moderate but unsatisfactory credibility. Strict physician verification boosted mDISCERN-1 and mDISCERN-3 scores, yet critical deficiencies emerged in mDISCERN-4/5 due to absent information sources and TikTok's lack of mandatory review mechanisms. This underscores the public health imperative for transparent source documentation.

GQS assessment yielded an average score of 3.43 (range 2–5), consistent with existing literature.⁴¹ Videos produced by physicians demonstrated superior quality compared to news media. Specifically, otolaryngologists generated higher-quality content than neurologists and TCM practitioners, and attending physicians outperformed senior colleagues (Table 3 and Figures 3–5). This observed advantage likely stems from complementary factors: otolaryngologists benefit from content that is more closely aligned with their core surgical specialty, whereas attending physicians may leverage their generally younger profile and greater familiarity with social media as digital natives to produce more comprehensible content.

PEMAT assessment revealed adequate understandability (73.73%) but inadequate actionability (51%), with the latter falling below the 70% clinical acceptability threshold.²⁵ This “understandability-actionability gap” is a recurring pattern in evaluations of patient education materials across digital platforms: providing clear information proves easier than outlining concrete, actionable steps.^7,41 Within our results, attending physicians’ videos and the solo narration and medical scenario formats achieved higher PEMAT-U scores, which we attribute to diversified production styles and detailed explanations. Conversely, senior physicians’ outpatient-derived content and text-heavy PPT formats underperformed. Moreover, otolaryngologists showed higher PEMAT-A scores than TCM practitioners before multiplicity adjustment (p < 0.05), a trend consistent with prior findings of lower quality scores for TCM-related health content on short-video platforms.⁴²
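The percentages above can be read against the standard PEMAT scoring rule: each applicable item is rated agree (1) or disagree (0), items marked not applicable are excluded, and the score is the share of agreed items expressed as a percentage. The following is a minimal sketch of that rule; the ratings are hypothetical and the function name `pemat_score` is an illustrative choice, not part of the study's analysis.

```python
from typing import Optional, Sequence

PEMAT_THRESHOLD = 70.0  # acceptability threshold cited in the text


def pemat_score(ratings: Sequence[Optional[int]]) -> float:
    """Percentage of applicable items rated 'agree'.

    ratings: 1 = agree, 0 = disagree, None = not applicable.
    """
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("no applicable items")
    return 100.0 * sum(applicable) / len(applicable)


# Hypothetical ratings for a single video (not study data).
# PEMAT-A/V has 13 understandability and 4 actionability items.
understandability = [1, 1, 0, 1, None, 1, 1, 0, 1, 1, None, 1, 1]
actionability = [1, 0, None, 1]

u = pemat_score(understandability)
a = pemat_score(actionability)
print(f"PEMAT-U {u:.1f}% adequate={u >= PEMAT_THRESHOLD}")
print(f"PEMAT-A {a:.1f}% adequate={a >= PEMAT_THRESHOLD}")
```

Because the actionability subscale has only four items, a single "disagree" moves the score by 25 points or more, which is one reason actionability scores tend to be more volatile than understandability scores.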

In summary, educational videos produced by otolaryngology attending physicians, particularly those using solo narration, achieved higher quality scores. Such content may raise public awareness of MD and encourage patients to seek initial specialty care from otolaryngology rather than neurology departments. More precise first-contact alignment could help correct historical referral discrepancies and optimize clinical pathways.

Correlation and stepwise regression analysis of video quality and characteristics

Spearman correlation analysis showed significant associations between TikTok video characteristics and quality metrics. Video length correlated positively, at weak to moderate strength, with engagement metrics. Although not strong, this trend tentatively aligns with established patterns in health information dissemination, where greater engagement often predicts broader reach, and may indicate that longer videos add content value or improve viewer retention.⁴³ The weaker correlations between video features and PEMAT-A scores indicate that audience engagement does not necessarily translate into actionable health content. The negative correlations observed between professional titles and both engagement metrics and PEMAT-U/A scores warrant careful interpretation, as they may reflect methodological considerations rather than causal relationships.
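Spearman's coefficient is the Pearson correlation computed on ranks, which suits heavily skewed engagement counts because a single viral video cannot dominate the result. The sketch below is a self-contained illustration with tie handling by average ranks; the video lengths and like counts are hypothetical, and the helper names are illustrative rather than taken from the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data (not study data): length in seconds and like counts.
lengths = [12, 30, 55, 76, 127, 267]
likes = [40, 35, 90, 300, 280, 1500]

rho = spearman(lengths, likes)
print(round(rho, 3))  # 0.886
```

The outlier of 1500 likes contributes only its rank (6), so it carries no more weight than any other top-ranked observation.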

Stepwise regression identified “Length,” “Professional titles,” “Doctors,” and “Number of topics” as significant predictors of PEMAT scores. The predictive models demonstrated substantial explanatory power, as evidenced by the regression results (Table 4). The models indicate that professional titles and doctor status were significant negative predictors. However, these negative coefficients should be interpreted cautiously, as they may reflect statistical coding conventions (e.g. 1 = attending and 3 = chief physicians) rather than intrinsic quality differences. Overall, these findings highlight how platform-specific features and content design choices influence health communication effectiveness.
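The point about coding conventions can be made concrete with a small example: reversing the numeric order assigned to professional titles flips the sign of the regression coefficient without changing the underlying gap between groups. The sketch below uses a simple one-predictor least-squares fit, not the stepwise models the authors fitted, and the scores and variable names are hypothetical.

```python
from typing import Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical PEMAT-U scores (percent) by professional title; not study data.
titles = ["attending", "attending", "associate chief",
          "associate chief", "chief", "chief"]
scores = [85.0, 80.0, 75.0, 72.0, 65.0, 61.0]

# Coding described in the text: 1 = attending ... 3 = chief physician.
ascending = {"attending": 1, "associate chief": 2, "chief": 3}
# The reverse convention, equally valid as a labeling choice.
descending = {"attending": 3, "associate chief": 2, "chief": 1}

slope_asc = ols_slope([ascending[t] for t in titles], scores)
slope_desc = ols_slope([descending[t] for t in titles], scores)

print(slope_asc)   # -9.75
print(slope_desc)  # 9.75
```

Either way the fitted difference between adjacent title levels is 9.75 points; only its sign depends on which end of the scale is coded as 1, so the direction of a coefficient should always be read alongside the coding scheme.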

Limitations

While this study utilized four validated assessment tools and involved three clinician raters, its findings are constrained because these tools lack formal validation for short-video formats on Chinese platforms. Potential systematic biases may therefore persist despite rigorous methodology. The analyzed content may not fully represent all MD-related topics, and our correlational design precludes causal inferences. Although searches were conducted while logged out to standardize the process and minimize individual-level personalization, TikTok's underlying recommendation algorithm remains unaccounted for in this analysis and could influence content visibility and engagement patterns, making it a potential source of bias. Platform user demographics and their variable interactions with health content may further limit generalizability. Finally, the cross-sectional design with single-day sampling captures only a temporal snapshot, potentially missing longitudinal content evolution. The lack of tools validated for short-video formats, in particular, underscores the need for future research to develop and validate assessment tools specifically adapted to the Chinese social media landscape.

Conclusion

Most MD-related TikTok content originates from certified physicians, ensuring foundational credibility. However, overall quality remains less than optimal across multiple validated metrics (PEMAT-A: 51%; GQS: 3.43/5), revealing significant gaps in actionable guidance and production value. The weak correlation between dissemination metrics and quality scores underscores that popularity does not equate to reliability. While the derived PEMAT predictive equations offer valuable insights for content design, their application requires contextual caution. These findings highlight critical digital health needs: (1) content creators should enhance informational depth and presentation quality, (2) platforms must implement robust medical content review systems, and (3) future research should explore algorithm-mediated health information dissemination to optimize public health communication efficacy.

Supplemental Material

sj-docx-1-dhj-10.1177_20552076261418919 - Supplemental material for Physician-dominated yet suboptimal: Evaluating the quality of Meniere's disease information on TikTok in China

Supplemental material, sj-docx-1-dhj-10.1177_20552076261418919 for Physician-dominated yet suboptimal: Evaluating the quality of Meniere's disease information on TikTok in China by Xin Wang, Dongling Lian and Zeyang Liu in DIGITAL HEALTH

Footnotes

Acknowledgements

The authors would like to express their gratitude to the video uploaders for their contributions to public health.

ORCID iDs

Xin Wang

Dongling Lian

Zeyang Liu

Ethical approval

No formal ethical approval was required for this study. The data comprised only publicly accessible TikTok videos that contained no personally identifiable information; accordingly, the project was classified as exempt from NHS Research Ethics Committee review following the UK Health Research Authority decision tool.¹²

Contributorship

XW was involved in conceptualization, methodology, investigation, formal analysis, and writing—original draft; DLL in investigation and validation; and ZYL in supervision and writing—review & editing. All authors read and approved the final manuscript.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

The data for this study, derived from the TikTok platform, have been anonymized to protect privacy, with URLs and titles retained for research integrity. Because of privacy concerns, these data are not publicly accessible. However, we are committed to sharing the data upon request for legitimate research purposes, in accordance with privacy protection principles and data sharing policies. Interested researchers should contact the corresponding author for access.

Guarantor

XW.

Supplemental material

Supplemental material for this article is available online.

References

1. Basura, Adams, Monfared, et al. Clinical practice guideline: Meniere’s disease. Otolaryngol Head Neck Surg 2020; 162: S1–S55.

2. Harcourt, Barraclough, Bronstein. Meniere’s disease. Br Med J 2014; 349: g6544.

3. Mohseni-Dargah, Falahati, Pastras, et al. Meniere’s disease: pathogenesis, treatments, and emerging approaches for an idiopathic bioenvironmental disorder. Environ Res 2023; 238: 116972.

4. McMullan, Berle, Arnáez, et al. The relationships between health anxiety, online health information seeking, and cyberchondria: systematic review and meta-analysis. J Affect Disord 2019; 245: 270–278.

5. Song, Zhao, Yao, et al. Serious information in hedonic social applications: affordances, self-determination and health information adoption in TikTok. J Doc 2022; 78: 890–911.

6. Cui, Cao, et al. Quality assessment of TikTok as a source of information about mitral valve regurgitation in China: cross-sectional study. J Med Internet Res 2024; 26: e55403.

7. Kong, Song, Zhao, et al. TikTok as a health information source: assessment of the quality of information in diabetes-related videos. J Med Internet Res 2021; 23: e30409.

8. Ostrovsky, Chen. TikTok and its role in COVID-19 information propagation. J Adolesc Health 2020; 67: 730.

9. Song, Xue, Zhao, et al. Short-video apps as a health information source for chronic obstructive pulmonary disease: information quality assessment of TikTok videos. J Med Internet Res 2021; 23: e28318.

10. Sun, Zheng. Quality of information in gallstone disease videos on TikTok: cross-sectional study. J Med Internet Res 2023; 25: e39162.

11. Yang, Duan. Assessment of health information in Chinese atopic dermatitis-related videos: a cross-sectional study. Digit Health 2025; 11: 20552076251346579.

12. Is my study research? London: Health Research Authority [cited 2025 Jul 29]. Available from: https://www.hra-decisiontools.org.uk/research/.

13. von Elm, Altman, Egger, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet 2007; 370: 1453–1457.

14. Ferhatoglu, Kartal, Ekici, et al. Evaluation of the reliability, utility, and quality of the information in sleeve gastrectomy videos shared on open access video sharing platform YouTube. Obes Surg 2019; 29: 1477–1484.

15. Mueller, Hongler VNS, Jungo, et al. Fiction, falsehoods, and few facts: cross-sectional study on the content-related quality of atopic eczema-related videos on YouTube. J Med Internet Res 2020; 22: e15599.

16. Lian, Pan, et al. Assessing the quality of breast cancer-related videos on TikTok: a cross-sectional study. Digit Health 2024; 10: 20552076241277688.

17. Koo. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 2016; 15: 155–163.

18. Tamošiūnaitė, Vasiliauskas, Dindaroğlu. Does YouTube provide adequate information about orthodontic pain? Angle Orthod 2023; 93: 403–408.

19. Liu, Peng, et al. Assessment of the reliability and quality of breast cancer related videos on TikTok and bilibili: cross-sectional study in China. Front Public Health 2023; 11: 1296386.

20. Demirtas, Alici. The reliability and quality of YouTube videos as a source of breath holding spell. Ital J Pediatr 2024; 50: 8. doi: 10.1186/s13052-023-01570-0

21. Singh. YouTube for information on rheumatoid arthritis—a wakeup call? J Rheumatol 2012; 39: 899–903.

22. Delli, Livas, Vissink, et al. Is YouTube useful as a source of information for Sjögren’s syndrome? Oral Dis 2016; 22: 196–201.

23. Castillo, Wassef, et al. YouTube as a source of patient information for prenatal repair of myelomeningocele. Am J Perinatol 2021; 38: 140–144.

24. Chen, Wang, Huang, et al. The quality and reliability of short videos about thyroid nodules on BiliBili and TikTok: cross-sectional study. Digit Health 2024; 10: 20552076241288831.

25. Shoemaker, Wolf, Brach. Development of the patient education materials assessment tool (PEMAT): a new measure of understandability and actionability for print and audiovisual patient information. Patient Educ Couns 2014; 96: 395–403.

26. Yeung, Abi-Jaoude. TikTok and attention-deficit/hyperactivity disorder: a cross-sectional study of social media content quality. Can J Psychiatry 2022; 67: 899–906.

27. López-Martín, Ardura-Martínez. El tamaño del efecto en la publicación científica. Educación XX1 2023; 26: 9–17.

28. Fritz, Morris, Richler. Effect size estimates: current use, calculations, and interpretation. J Exp Psychol Gen 2012; 141: 2–18.

29. Yao, et al. Short video platforms as sources of health information about HPV vaccine: a content and quality analysis. Digit Health 2025; 11: 20552076251379340.

30. Reddy, Cheng, Jufas, et al. Assessing the quality of patient information for cholesteatoma on the video sharing platform YouTube. Otol Neurotol 2023; 44: e230–e234.

31. Strychowsky, Nayan, Farrokhyar, et al. YouTube: a good source of information on pediatric tonsillectomy? Int J Pediatr Otorhinolaryngol 2013; 77: 972–975.

32. Sorensen, Pusz, Brietzke. YouTube as an information source for pediatric adenotonsillectomy and ear tube surgery. Int J Pediatr Otorhinolaryngol 2014; 78: 65–70.

33. Oremule, Patel, Orekoya, et al. Quality and reliability of YouTube videos as a source of patient information on rhinoplasty. JAMA Otolaryngol Head Neck Surg 2019; 145: 282–283.

34. Huang, Lan, Jiang, et al. The quality and reliability of patient education regarding sound therapy videos for tinnitus on YouTube. PeerJ 2024; 12: e16846.

35. Tan DJY, Fan. The readability and quality of web-based patient information on nasopharyngeal carcinoma: quantitative content analysis. JMIR Form Res 2023; 7: e47762.

36. Liu, Chen, Lin, et al. YouTube/bilibili/TikTok videos as sources of medical information on laryngeal carcinoma: cross-sectional content analysis study. BMC Public Health 2024; 24: 1594.

37. Zhang, et al. Evaluating the reliability and quality of knee osteoarthritis educational content on TikTok and bilibili: a cross-sectional content analysis. Digit Health 2025; 11: 20552076251366390.

38. Shan, Xing, Dong, et al. Translating and adapting the DISCERN instrument into a simplified Chinese version and validating its reliability: development and usability study. J Med Internet Res 2023; 25: e40733.

39. Azer. Are DISCERN and JAMA suitable instruments for assessing YouTube videos on thyroid cancer? Methodological concerns. J Cancer Educ 2020; 35: 1267–1277.

40. Vishnevetsky, Walters, Tan. Interrater reliability of the patient education materials assessment tool (PEMAT). Patient Educ Couns 2018; 101: 490–496.

41. Morra, Collà Ruvolo, Napolitano, et al. YouTube™ as a source of information on bladder pain syndrome: a contemporary analysis. Neurourol Urodyn 2022; 41: 237–245.

42. Zheng, Tong, Wan, et al. Quality and reliability of liver cancer-related short Chinese videos on TikTok and bilibili: cross-sectional content analysis study. J Med Internet Res 2023; 25: e47210.

43. Weng, Menczer, Ahn Y-Y. Virality prediction and community structure in social networks. Sci Rep 2013; 3: 2522.
