Abstract
To mitigate the rapid spread of health misinformation and its negative impact, this study presents a comprehensive literature review on health misinformation detection. A systematic search is conducted using the Google Scholar database, targeting publications from January 2016 to February 2025. Inclusion criteria require full-text, English-language studies proposing health misinformation detection methods. A total of 100 relevant studies are included. The characteristics of health misinformation are identified through a detailed analysis of its concept, dissemination mechanism, psychological impact, and susceptibility. Datasets and evaluation metrics are reviewed, with issues such as class imbalance and inconsistencies in annotation standards being identified. The strengths and limitations of various detection approaches are examined. Machine learning approaches perform better when using ensemble methods, feature selection techniques, and embedding-based representations. Deep learning algorithms are strong in automatic feature extraction and high-dimensional semantic modeling, though they often face challenges such as high computational cost and low interpretability. Advanced detection methods show clear improvements in accuracy and explainability, while also raising new challenges, such as AI-generated misinformation and its associated ethical concerns. This review provides a panoramic view of the current state-of-the-art in health misinformation detection. It further underscores the importance of interdisciplinary collaboration, human-centered design, and ethical considerations for the development of effective and clinically relevant detection systems.
Highlights
Health misinformation characteristics are identified, including concept, dissemination, psychological impact, and susceptibility.
Detection datasets are analyzed and evaluation metrics are summarized.
Machine learning performance differences and accuracy-enhancing methods are emphasized.
Unimodal and multimodal deep learning methods are explored and limitations are examined.
Advanced methods are reviewed, and AI-generated misinformation risks and ethical considerations are analyzed.
Introduction
In the digital age, the widespread use of the Internet has revolutionized access to health information, and advances in digital technology have made such information easier to both access and disseminate. With the rapid development of the Internet as a 24/7 news and information source, individuals increasingly expect to obtain essential information as part of their daily routine. 1 However, social media has also opened up space for misinformation and facilitated its penetration and proliferation on the Internet during health emergencies. 2 In the majority of current research, there is a view that digital technology, particularly social media, has amplified the problem of health misinformation. 3 Not only official governmental health organizations but anyone with an account can post and disseminate health information, ranging from dietary guidelines to disease treatment, on various social media platforms. 4 Yet patients and other individuals often turn to these media sources for information about diseases, treatments, and medications, where inaccurate content may pose serious risks to their health. 5 A recent national survey by the Pew Research Center reported that 7 in 10 (72%) adult Internet users search online for a variety of health issues. 6 These dynamics highlight both the opportunities and the significant risks that digital technologies pose to public health, underscoring the urgent need for effective strategies to identify, evaluate, and mitigate health misinformation online.
Several reviews have recently examined health misinformation from different perspectives, including its scope and prevalence, the factors driving its creation and spread, its negative social and psychological impacts, and the strategies proposed for mitigation and intervention. Heley et al conducted a scoping review of 115 publications published between 2017 and 2022, focusing on misinformation mitigation interventions such as correction, education, and prebunking. 7 Borah examined the politicization of health issues in relation to misinformation and social media. 8 Zhang et al discussed the factors contributing to the creation and spread of health misinformation, its negative impacts, possible solutions, and the research methods used in prior studies. 9 Kbaier et al carried out a scoping review on health misinformation in social media, addressing its prevalence, impacts, mitigation strategies, and the experiences of health professionals in responding to it. 3 Peng et al systematically reviewed persuasive strategies used in online health misinformation and categorized 238 strategies into thematic groups. 10 Southwell et al reviewed the relationship between health misinformation and health disparities, showing how vulnerable groups are more likely to be influenced and evaluating the effectiveness of intervention strategies. 11 Hu et al conducted a meta-analysis on older adults, synthesizing evidence on prevalence of misinformation exposure and the impact of interventions in this population. 12 Rocha et al systematically reviewed the role of social media in the spread of infodemics, noting that exposure to health-related fake news may cause psychological consequences such as panic, fear, depression, and fatigue. 13
From the perspective of health misinformation detection, Schlicht et al provided a systematic review of computer science literature on automated health misinformation detection. They applied text mining and machine learning methods to develop a classification framework that analyzed studies by information sources, health topics, misinformation types, detection tasks, and technical methods. 14 Ravichandran and Keikhosrokiani focused on the classification of misinformation on social media during the COVID-19 pandemic. They examined 4 categories of techniques—neuro-fuzzy systems, neural networks, natural language processing, and traditional machine learning. Their aim was to identify the types of classification methods applied in this field, evaluate the most effective neuro-fuzzy and neural network approaches, and analyze the strengths and limitations of these techniques. 15
In summary, prior surveys on health misinformation detection have primarily concentrated on technical methods but remain incomplete. They often neglect advanced approaches such as knowledge graph–based techniques, fact-checking strategies, and large language models. Moreover, they rarely integrate interdisciplinary perspectives, for example the psychological impacts of misinformation or the vulnerabilities of specific populations. Compared with these reviews, this paper provides a more comprehensive survey. First, we analyze the characteristics of health misinformation in terms of its definition, dissemination mechanisms, psychological effects, and individual susceptibility. Second, we classify and summarize existing datasets for health misinformation detection. Third, we review evaluation metrics commonly used in this domain. Fourth, we discuss detection methods from 5 perspectives: machine learning, deep learning, knowledge graph–based approaches, fact-checking methods, and large language models. Finally, we highlight current limitations and outline directions for future research.
In this study, we conducted a literature search using the Google Scholar database. The inclusion criteria were as follows: (1) publications dated between January 2016 and February 2025, (2) articles written in English and available in full text, (3) studies proposing methods for detecting misinformation in the health domain, and (4) further validation in Web of Science to ensure that the selected literature is indexed in the Science Citation Index (SCI) or the Social Sciences Citation Index (SSCI), or belongs to an international conference recommended by the China Computer Federation (CCF). 16 SCI and SSCI are prestigious citation indexes encompassing high-quality, peer-reviewed journals in the natural sciences and social sciences, respectively. Likewise, the CCF ranking system identifies high-impact international conferences in the field of computer science. By implementing this verification process, we ensure that the reviewed literature meets rigorous academic standards and maintains high credibility.
First, we set the search term with all of the words in Advanced Search as “health misinformation detection,” limited the publication timeframe to 2016 to 2025, and sorted the results by relevance. This search yielded 87 studies that met the inclusion criteria. Next, we set the search term with all of the words to “detection” and the search term with the exact phrase to “health misinformation,” maintaining the 2016 to 2025 timeframe. After removing duplicate results from the previous search, an additional 13 studies met the inclusion criteria. In summary, a total of 100 studies were included in this review, comprising 72 SCI or SSCI-indexed articles and 28 papers from international conferences recommended by the China Computer Federation (CCF). The distribution of publications over time is illustrated in Figure 1.
The main contributions of our work are summarized as follows:
The characteristics of health misinformation are summarized, including its concept, dissemination mechanisms, psychological impacts, and individual susceptibility.
A comprehensive analysis of existing binary and multi-class health misinformation detection datasets is conducted, and commonly used evaluation metrics for binary, multi-class, and imbalanced datasets are summarized.
Significant performance differences among various machine learning algorithms are identified, with the importance of ensemble methods, feature selection techniques, and embedding models in enhancing detection accuracy being emphasized.
The application of unimodal and multimodal deep learning-based approaches in health misinformation detection is explored, with their effectiveness, limitations, and future improvement directions being systematically examined.
Other advanced detection methods for health misinformation are reviewed, with their roles in improving detection accuracy and model interpretability, as well as the risks associated with AI-generated misinformation and ethical issues, being analyzed.
Figure 1. Publication trends over time.
The structure of this paper is as follows: Section 2 introduces the characteristics of health misinformation. Section 3 presents existing health misinformation datasets. Section 4 introduces the evaluation metrics for assessing health misinformation detection effectiveness. Sections 5 to 7 discuss various health misinformation detection methods, including machine learning approaches, deep learning methods, knowledge graphs, fact-checking, and large language models. Section 8 provides a discussion, summarizing the findings of this study, its limitations, and potential directions for future research. Finally, the Conclusion section provides a summary of the research.
Characteristics of Health Misinformation
Concept of Health Misinformation
There is no universally agreed-upon definition of fake news, but it is commonly categorized into 2 main types: disinformation and misinformation.17,18 The distinction between these terms primarily lies in their intent. Misinformation refers to information that is unknowingly false and is shared without malicious intent, whereas disinformation involves the deliberate dissemination of false information with the aim of causing harm. 19 In addition, some scholars have expanded the classification of fake news to include a third category: malinformation.20,21 Malinformation pertains to information that is truthful but is shared with the intent to harm or mislead, such as malicious rumors. 21 Furthermore, alternative perspectives have been proposed regarding the concept of misinformation. Some scholars argue that it serves as an umbrella term encompassing both intentionally disseminated misleading information and unintentionally misleading information.22-24
Due to the widespread presence of misinformation in the health domain, some scholars have sought to define the concept of health misinformation more precisely. Chou et al define health misinformation as a health-related claim that is based on anecdotal evidence, false, or misleading owing to the lack of existing scientific knowledge. 25 The definition includes both false information shared without intent to harm and information—whether false or reality-based—that is deliberately intended to cause harm to individuals, social groups, institutions, or countries. 26 Krishna and Thompson define health misinformation as the acceptance of false or scientifically inaccurate information despite exposure to accurate data, in the absence of accurate information, or within historical and contextual influences. 27 Similarly, Carlson argues that health misinformation includes biased or out-of-context health information, as well as false claims that are not evidence-based. 28 Swire-Thompson and Lazer define science and health misinformation as information that contradicts the epistemic consensus of the scientific community. Under this definition, perceptions of truth and falsehood evolve over time as new evidence emerges and scientific methods advance. 29 Zhong provides a broad definition, stating that health misinformation encompasses all forms of false or low-quality health information. 30 Darwish et al further specify that any post, tweet, or shared resource that misinterprets expert-accepted medical knowledge constitutes misleading health information, including fake news articles, memes, and social media posts containing false claims. 31
To describe the rapid spread of misinformation, particularly on social media, the World Health Organization (WHO) introduces the term “infodemic.” 32 The WHO defines an infodemic as the excessive proliferation of false or misleading information during a disease outbreak, occurring in both physical and digital environments. Such misinformation can cause public confusion, encourage risky behaviors, and ultimately undermine public health efforts. 33 It is characterized by the simultaneous spread of both true and false information. 34
Based on existing research, we identify the following characteristics of health misinformation: (1) it includes not only scientifically inaccurate false information but also true information that is intended to harm or mislead; (2) the veracity of information must be evaluated in the context of its dissemination, and it may evolve over time as new evidence emerges and scientific methodologies advance; (3) it manifests in various forms, including news articles, memes, and social media posts. From these characteristics, we identify 2 major challenges in detecting health misinformation. First, assessing the intent of information disseminators remains operationally difficult. Second, the dynamic nature of scientific consensus introduces an inherent relativity in defining misinformation. These challenges suggest that future research should develop a more context-sensitive and temporally adaptive conceptual framework to provide a solid theoretical foundation for automated detection systems.
Dissemination Mechanism of Health Misinformation
Health misinformation is widely disseminated on social media platforms and can lead to harmful consequences such as incorrect care-seeking behavior, reduced trust in the health care system, and even the spread of disease. Understanding the transmission mechanism of health misinformation is therefore essential for designing effective detection algorithms. This section explores several key aspects, including the mechanisms of health misinformation dissemination and the differences in dissemination characteristics between misinformation and true information.
Scholars have extensively studied the mechanisms of health misinformation dissemination on social media. Rodrigues et al identified several key factors contributing to the spread of health misinformation, including the amplification effects of social media algorithms, emotionally driven dissemination, political influences, and a crisis of public trust. 35 Kalantari et al conducted a social network analysis of user interactions to identify accounts that played a central role in spreading COVID-19-related misinformation. Their findings indicate that news organizations and medical institutions serve as primary nodes in the dissemination network, while public figures and their supporters form communication clusters that significantly impact the spread of misinformation. 36 Zhou et al highlighted that misinformation related to health caution and advice, health help-seeking, and emotional support is a significant determinant of individuals’ dissemination behavior. 37 Zhang et al found that content congruence and affective congruence between tweets and their corresponding comments significantly influence the spread of misinformation on Twitter. 38 Xue et al further revealed that misinformation originating from pseudo-authoritative sources and associated with negative sentiment was more likely to be disseminated, though accuracy cueing interventions were effective in reducing sharing intentions. 39 Zhao et al demonstrated that both highly active and low-activity users were prone to sharing misinformation, with social proof mechanisms (eg, likes and retweets) playing a critical role in its spread. 40 Ganti et al investigated the impact of narrative style on the spread of health misinformation on social media. Their findings revealed that misinformation presented in a narrative format, especially vaccine-related content, led to higher user engagement. 41 These results highlight the complex factors driving the spread of health misinformation. Social media dynamics, user behavior, and information framing all play a crucial role in its proliferation.
When it comes to the differences in dissemination characteristics, the spread of health misinformation varies from that of truthful information. Safarnejad et al reconstructed the information dissemination networks during the Zika virus outbreak. They found that misinformation networks were more complex, had longer diffusion paths, and were more likely to form high-impact localized dissemination clusters. 42 Zhong conducted a multilevel analysis of Twitter health news reports and found that while low-quality health information was more frequently discussed, it did not necessarily persist longer. 30 Osude et al observed that health misinformation spreads faster than factual information due to its emotionally charged, alarmist, and easily digestible nature. 43 Edinger et al found that misinformation spreads much faster than official public health information. In early tweets, negativity was the dominant sentiment. Their findings suggest that health misinformation thrives within complex communication networks. Its rapid diffusion and strong emotional appeal contribute to its widespread reach on social media. 44
In summary, the dissemination of health misinformation on social media is shaped by a combination of technological, psychological, and social factors. Algorithmic amplification, emotional framing, and the influence of key opinion leaders can significantly accelerate the spread of health misinformation, while social proof mechanisms further reinforce its visibility. Compared with truthful information, health misinformation tends to diffuse more rapidly, persist in more complex network structures, and attract higher levels of engagement due to its emotional appeal and narrative style. These findings underscore the urgent need for targeted interventions, such as accuracy cueing, content moderation, and the promotion of trustworthy health communication, in order to mitigate the harmful consequences of misinformation proliferation.
Psychological Impact of Health Misinformation
Exposure to health misinformation can lead to psychological and mental consequences. Banerjee et al reported that the digital infodemic, owing to its huge misinformation load, is responsible for multiple psychological issues such as anxiety, fear, agitation, uncertainty, noncompliance with precautions, and stigma. The increased screen time and unhealthy technology use found in the population at large further contribute to these mental health problems. 45 Dubey et al found that health misinformation has generated an increase in fear, frustration, and anguish in the population, resulting in a series of symptoms characteristic of mental disorders, such as anxiety, phobia, panic, depressive behavior, obsession, irritability, delusions of having symptoms similar to COVID-19, and other paranoid ideas. 46 Wu et al reported that exposure to rumors, misinformation, and negative pandemic-related information was significantly associated with an increased risk of mental health problems, including depressive and anxiety symptoms, stress, and social isolation. 47
To assess the psychological impact of health misinformation, especially emotional aspects such as depression and anxiety, some scholars have applied validated psychological constructs and scales. Gao et al examined the association between social media exposure (SME) and the prevalence of mental health problems during COVID-19. Depression was assessed by the Chinese version of the WHO-Five Well-Being Index (WHO-5), and anxiety was assessed by the Chinese version of the Generalized Anxiety Disorder scale (GAD-7). They found a high prevalence of mental health problems, positively correlated with frequent SME during the outbreak. 48 Pedro et al examined the relationship between the frequency and mode of COVID-19-related information acquisition and psychological symptoms. Symptom severity was assessed using the Hamilton Depression Scale (HAM-D), the Hamilton Anxiety Scale (HAM-A), and a Stress Symptoms Inventory adapted from the Checklist 90-R. Individuals who sought information on social media reported greater depressive symptom severity than those who did not, while those who used WhatsApp for information seeking had lower anxiety and stress levels. 49 Wu et al investigated media consumption patterns and their associations with depressive and anxiety symptoms among adults affected by the COVID-19 pandemic. Depression and anxiety were assessed with the Patient Health Questionnaire and the Generalized Anxiety Disorder Scale, respectively. They found that individuals who rarely used social media had the lowest prevalence of depression and anxiety, while those who rarely used traditional media had the highest prevalence of depression. 47 Hammad and Alqarni explored exposure to misleading social media news in relation to anxiety, depression, and social isolation among 371 Saudi participants. Using the Generalized Anxiety Disorder-7, the Centre for Epidemiological Studies Depression Scale, and the de Jong Gierveld Loneliness Scale, they found that misinformation exposure was positively associated with all 3 outcomes. 50
In summary, existing evidence demonstrates that exposure to health misinformation exerts significant psychological and psychiatric consequences. These impacts range from transient emotional responses, such as fear, worry, and agitation, to more persistent outcomes, including anxiety disorders, depression, social isolation, medical mistrust, and paranoid ideation. Importantly, empirical studies have operationalized these effects using validated psychological constructs and standardized clinical scales, confirming that misinformation not only shapes beliefs and attitudes but also produces measurable mental health symptoms. Collectively, these findings highlight that health misinformation acts as a psychosocial stressor, whose effects can be systematically evaluated within established psychiatric frameworks. Given the tangible emotional, cognitive, and psychiatric consequences, the identification, and detection of health misinformation are essential steps to mitigate its impact and protect public mental health.
Susceptibility to Health Misinformation
An individual’s susceptibility to health misinformation is shaped by multiple factors, with emotional and cognitive factors playing particularly important roles. Osude et al reported that upon exposure to health misinformation, cultural identity, psychological traits, and perceptions of illness susceptibility and severity influence the likelihood of believing such misinformation. 43 Pan et al found that health-related anxiety, preexisting misinformation beliefs, and repeated exposure—3 emotional and cognitive elements—were positively associated with health misinformation acceptance. 51 Piksa et al proposed 4 psycho-cognitive phenotypes of information susceptibility: Consumers, Knowers, Duffers, and Doubters. The study found significant differences among phenotypes in feedback sensitivity, cognitive bias, Big Five personality traits, anxiety, narcissism, optimism, and reward–punishment sensitivity. These results suggest that susceptibility is shaped not only by truth-discernment ability but also by emotional stability, anxiety levels, and personality traits. 52 Beauvais emphasized that heightened emotions, particularly fear and anger, can impair critical evaluation and increase susceptibility to misinformation, especially among individuals with pre-existing cognitive or emotional vulnerabilities. Confirmation bias and motivated reasoning were identified as key mechanisms reinforcing false beliefs. 53 Collectively, these findings are consistent with psychiatric perspectives on belief formation and maintenance, which highlight the interplay between emotional dysregulation, cognitive distortions, and personality-related vulnerabilities in shaping susceptibility.
In addition to these individual-level susceptibility mechanisms, certain populations are disproportionately vulnerable to health misinformation. Choukou et al defined vulnerable populations as groups of persons in need of special support adapted to their socioeconomic status, health needs, or any context that prevents access to digital health information. Examples include people who are illiterate or digitally illiterate, older adults, people with visual or hearing impairments, people with mental or cognitive impairments, people living in remote or underserved communities or with limited access to the Internet, Indigenous people living on reserve, immigrants, people facing language barriers, and people with low socioeconomic status. 54 Scherer et al found that a person who is susceptible to online misinformation about 1 health topic may be susceptible to many types of health misinformation. Individuals who were more susceptible to health misinformation had less education and health literacy, less health care trust, and more positive attitudes toward alternative medicine. 55 Pan et al found that females were more likely than males to accept health misinformation, while age, education, and income were negatively associated with acceptance. 51 Escolà-Gascón et al investigated the psychological and psychopathological profiles that characterize fake news consumption. Using the State–Trait Anxiety Inventory (STAI), the Positive and Negative Affect Schedule (PANAS), and the Multivariable Multiaxial Suggestibility Inventory (MMSI-2) based on DSM-5, they found that individuals with schizotypal, paranoid, or histrionic personality traits were less effective at detecting fake news and more vulnerable to its negative effects, displaying higher anxiety and greater cognitive biases related to suggestibility and the Barnum Effect. 56 Perlis et al analyzed responses from 2 waves of a 50-state nonprobability internet survey in which depressive symptoms were measured using the 9-item Patient Health Questionnaire (PHQ-9). They found that individuals with moderate or greater depressive symptoms were more likely to endorse vaccine-related misinformation. 57
In summary, susceptibility to health misinformation is influenced by a complex interplay of emotional, cognitive, and personality-related factors, as well as broader sociodemographic conditions. Evidence consistently shows that vulnerable populations are disproportionately affected by health misinformation. This underscores the need for detection algorithms and intervention strategies that not only achieve high overall accuracy but also address the specific vulnerabilities of these populations. Incorporating human-centered and equity-driven perspectives will be essential to ensure that health misinformation detection systems are inclusive, effective, and capable of protecting those most at risk.
Datasets of Health Misinformation
Health misinformation originates from various online sources, including health-related websites, online health forums, and social media platforms. 5 It generally appears in 2 main forms: news articles and user-generated content (UGC). UGC is a term used to describe any content created by users, including texts, videos, images, reviews, live streams, and other forms of media. 58 Compared with news articles, UGC is typically not peer-reviewed, lacks source citations, 59 and often contains misspellings, noise, and abbreviations. 60 To support the development of automated health misinformation detection systems, researchers have collected news articles and UGC from websites and social media platforms. They analyzed the characteristics of this content and compiled labeled datasets that form the foundation for detection tasks. Most studies frame the problem as binary classification, distinguishing between true and false information. Others take a more fine-grained approach, dividing misinformation into multiple categories to better capture its diverse characteristics. The following section outlines these datasets, and the classification methods are illustrated in Figure 2.
Figure 2. Dataset categorization.
Binary Categorical Datasets
Most health misinformation detection datasets are categorized into 2 classes: true and false. In this paper, we further classify binary datasets into 3 subcategories: general health-related datasets, vaccine-related datasets, and COVID-19-related datasets. Detailed information on the binary datasets is presented in Appendix 1.
General health-related datasets provide a comprehensive foundation for research on health misinformation detection. Liu et al compiled a health-related dataset consisting of 2296 articles from reliable sources and 2085 from unreliable sources. 4 Dai et al introduced the FakeHealth dataset, which comprises 2 subsets: HealthStory and HealthRelease. The dataset’s true and false labels were assigned based on 10 expert-evaluated news credibility criteria. 61 Safarnejad et al collected English-language Twitter data about the Zika virus throughout 2016. They identified 264 highly influential misinformation tweets and paired them with 455 verified information tweets. 62 Barve and Saini developed the CredHealth and FactHealth dataset by retrieving healthcare-related URLs from Google. The credibility of each URL was assessed using a scoring method, where URLs with a score above 9 were labeled as false and those scoring between 3 and 6 were labeled as true. 63 Martinez-Rico et al constructed the KEANE dataset by aggregating healthcare-related articles from fact-checking websites such as Health Feedback, Snopes, and Politifact, categorizing them as true or false. 5 Liu et al developed the Verified Health Information (VHI) dataset, capturing corroborated cases from the food safety and healthcare sections of the Jiaozhen platform. The dataset includes 4313 false cases and 1388 true cases. 64 Li compiled a health misinformation dataset from Tencent’s fact-checking platform, comprising 10 560 health-related entries, of which 2150 were true and 8410 were false. 65
Vaccination is a fundamental pillar of public health, essential for preventing the spread of diseases and safeguarding both individuals and communities. As a prominent subset of health misinformation, vaccine-related misinformation poses significant risks to public health efforts. Du et al retrieved discussions on HPV vaccination from Reddit between 2007 and 2017 using Pushshift, randomly sampling 28 121 posts and categorizing them as misinformation or non-misinformation. 66 Hayawi et al introduced ANTiVax, a dataset containing 15 073 vaccine-related tweets collected via the Twitter API, of which 5751 were classified as misinformation and 9322 as general vaccine-related content. 67 Joshi et al proposed the MiSoVac dataset, which gathers misinformation related to the COVID-19 vaccine from social media sites; 404 items were labeled true and 607 were labeled false. 68
The COVID-19 pandemic has affected billions of people worldwide, causing widespread health, social, and economic disruptions. 69 Alongside the outbreak, a surge of COVID-19-related health misinformation has rapidly spread across the internet, misleading public perceptions of the crisis and potentially undermining efforts to control the pandemic. To address this challenge, numerous researchers have proposed datasets dedicated to COVID-19 misinformation detection.
The majority of COVID-19-related misinformation datasets are in English. Cui and Lee introduced CoAID (COVID-19 heAlthcare mIsinformation Dataset), which compiles news articles, user engagements, and social platform posts. The dataset consists of 1788 fake claims, 18 079 fake news articles, 21 043 true claims, and 260 037 true news articles. 70 Zhou et al developed ReCOVery, an English-language dataset comprising 2029 news articles, 140 820 tweets, and 93 761 users, all categorized as either reliable or unreliable. 71 Al-Rakhami and Al-Amri retrieved COVID-19-related tweets using specific hashtags and keywords, subsequently annotating 121 950 tweets as credible and 287 534 as non-credible. 72 Paka et al presented the first COVID-19 Twitter fake news dataset, CTF. The dataset contains a total of 45.26K labeled tweets, among which 18.55K are labeled as genuine and 26.71K as fake. In addition, it contains 21.85M unlabeled tweets, which can be used to enrich the diversity of the dataset in terms of linguistic and contextual features. 73 Patwa et al developed Constraint@AAAI2021, which includes social media posts and news articles about COVID-19. The dataset contains 5600 entries labeled as true and 5100 as false, serving as a useful resource for studying misinformation. 74 Shang et al collected 891 English videos related to COVID-19 from TikTok, including 226 misleading videos and 665 non-misleading videos. 75 Khan et al used fake and real news articles about COVID-19 collected from multiple platforms. The dataset had 1164 instances, of which 586 were true and the remaining 578 were fake news. 76 Zamir et al used the “COVID Fake News Dataset” from kaggle.com. The dataset was collected from news articles by crawling various websites and manually annotated by experts. It contains 6420 tweets, of which 3360 are labeled as real and 3060 as fake. 77 Koirala collected global COVID-19 news articles using the Webhose.io tool. The final dataset contained 4072 articles, of which 2426 were labeled as true and 1646 as false. 78
In addition to English, some scholars have introduced COVID-19-related datasets in other languages to enrich the corpus of health misinformation detection datasets. Li et al proposed the MM-COVID dataset containing fake news content, social engagements, and spatial-temporal information in 6 different languages. 79 Yang et al presented CHECKED, a Chinese dataset comprising 2104 verified COVID-19-related tweets, of which 1760 are labeled true and 344 false. The dataset provides rich multimedia information for each microblog, including the ground-truth label and associated textual, visual, temporal, and network data. 80 Du et al collected and annotated a Chinese COVID-19 news dataset, examining more than 1000 fact-checked news items and retaining 86 fake and 114 real news articles. 81 Ghayoomi and Mousavian presented the Persian COVID-19 fake news dataset, which was collected from social media, including Twitter and Instagram, as well as the websites of various Persian news agencies. It includes 265 real news articles and 265 fake news articles. 82 Bozuyla and Özçift developed a Turkish dataset to identify the veracity of COVID-19 news. The data originated from Twitter, Turkish fact-checking websites, and a COVID-19 dataset translated from English into Turkish. A total of 2110 COVID-19 samples were collected, comprising 1050 true news articles and 986 fake news articles. 83 Bonet-Jover et al presented the Spanish Reliable and Unreliable News (RUN) dataset, focusing on health and COVID-19. The RUN dataset contains 80 Spanish-language news items, of which 51 are reliable and 29 are unreliable. 84
In summary, the field of health misinformation detection has accumulated a substantial number of binary classification datasets, characterized by the following features. First, data sources are highly diverse, including social media posts, news articles, fact-checking platforms, and user-generated content, with labels assigned through expert annotation, scoring systems, or manual review. Second, research focus exhibits a domain-specific hierarchy: COVID-19-related datasets dominate, reflecting the direct impact of public health crises on research demand; vaccine-related datasets receive sustained attention due to their relevance to preventive medicine; and general health datasets provide foundational support for cross-domain studies. Notably, while English remains the dominant language, recent years have seen the emergence of multilingual datasets, expanding the geographical scope of research. Despite these developments, existing datasets continue to face persistent challenges. Widespread class imbalance often undermines model generalization, while inconsistencies in annotation standards hinder cross-dataset transfer learning and complicate comparative evaluations.
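To illustrate how the class imbalance noted above is typically diagnosed and compensated for before training, the following minimal sketch inspects the label distribution of a labeled dataset and derives inverse-frequency class weights with scikit-learn; the file name and column layout are illustrative assumptions, not any of the datasets reviewed here.

```python
import numpy as np
import pandas as pd
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical dataset with "text" and binary "label" columns
# (1 = misinformation, 0 = reliable).
df = pd.read_csv("health_misinfo.csv")

# Inspect the class distribution; ratios far from 1:1 signal imbalance.
print(df["label"].value_counts(normalize=True))

# Derive inverse-frequency weights, which most classifiers accept via a
# class_weight or sample_weight argument to counteract the imbalance.
classes = np.unique(df["label"])
weights = compute_class_weight(class_weight="balanced", classes=classes, y=df["label"])
print(dict(zip(classes, weights)))
```

Such weighting (or, alternatively, resampling) is a common first step before the detection methods surveyed in later sections are applied to imbalanced corpora.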
Multi-Categorical Datasets
In addition to binary classification, some researchers have proposed multi-categorical approaches to better capture the complexity and diversity of health misinformation. Multi-categorical datasets can generally be divided into 2 groups: those that classify misinformation into 3 categories, and those that adopt more fine-grained schemes with more than 3 categories. Detailed information on the multi-categorical datasets is presented in Appendix 2.
Several studies have adopted a 3-category classification scheme for health misinformation. Sicilia et al collected 709 samples related to the Zika virus from Twitter, which annotators manually labeled into 3 categories: rumor, non-rumor, and unknown. 85 Hossain et al proposed a benchmark dataset, COVIDLIES, containing 6761 expert-annotated tweets with Agree, Disagree, and No Stance labels. 86 Haouari et al introduced an Arabic COVID-19 Twitter dataset where each tweet was labeled with 3 categories: true, false, and other. 87 Luo et al constructed a Chinese infodemic dataset for identifying misinformation by labeling the collected records as questionable, false, and true. 88 Cheng et al proposed a COVID-19 rumor dataset including rumors from Twitter and news websites, which were manually labeled into 3 categories: true, false, and unverified. 89 Sarrouti et al constructed a dataset, HEALTHVER, for evidence-based fact-checking of COVID-19-related statements, with each evidence-claim pair labeled supports, refutes, or neutral. 90 Srba et al presented a dataset for the study of medical misinformation containing approximately 317 000 medical news articles and 3500 fact-checked statements, as well as mappings between articles and statements. Claim-stance pairs were labeled as supporting, contradicting, and neutral. 91 Nabożny et al extracted more than 10 000 sentences from 247 online medical articles, labeled by medical experts as credible, non-credible, and neutral. 23
Other researchers have employed more fine-grained categorizations, dividing health misinformation into more than 3 categories. Kim et al introduced the FibVID dataset, which contains news statements and related tweets about the COVID-19 pandemic. Each entry is labeled into 1 of 4 categories: COVID True, COVID Fake, non-COVID True, and non-COVID Fake. 92 Zhao et al extracted records from the “Autism Forum” of Baidu PostBar and coded them into 5 categories: Advertising, Propaganda, Misleading information, Unrelated information, and Legitimate information. 93 Memon and Carley proposed the CMU-MisCOV19 dataset, using Twitter API to collect 4573 COVID-19-related tweets. The tweets were manually annotated into 17 categories, including Irrelevant, Conspiracy, True Treatment, True Prevention, Fake Cure, and so on. 94 Shahi and Nandini presented a multilingual FakeCOVID dataset containing 7623 news articles related to COVID-19 and the dataset was annotated into 23 categories including True, Mostly True, Partially True, Mixture, Misleading, and so on. 95 Dharawat et al proposed Covid-HeRA, a dataset of 61 286 COVID-19-related tweets collected from Twitter. The tweets were annotated by regular users into 5 categories: Possibly Severe, Highly Severe, Refutes/Rebuts, Other, and Real News/Claims. 96
In the field of health misinformation detection, the construction of multi-class datasets reflects the need for fine-grained identification of complex information types. Existing studies primarily adopt a 3-class framework, leveraging expert or manual annotation to develop datasets from social media, news platforms, and medical forums, covering public health events such as the Zika virus and COVID-19. Some researchers have further expanded classification dimensions, introducing 4- to 5-class or even more granular multi-label systems to capture the diversity of health information. These datasets exhibit significant variation in linguistic diversity, data scale, and annotation methods, highlighting the dynamic adaptation of misinformation definitions and classification standards across different research contexts. Future studies should explore unified annotation frameworks for cross-domain and multilingual datasets while enhancing model robustness in handling complex classification schemes.
Evaluation Metrics for Health Misinformation Detection
To assess the effectiveness of misinformation detection methods in classification tasks, researchers commonly adopt standardized, statistically grounded performance metrics. These indicators, derived from confusion matrices, enable consistent evaluation and comparison of model performance across studies.
A confusion matrix, also known as an error matrix, is a table that visually depicts the performance of a supervised classification machine learning system. 97 It compares the model’s predicted labels with the actual ground truth labels, and consists of 4 key components: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN), as shown in Table 1.
Table 1. Confusion Matrix.

                      Predicted positive     Predicted negative
Actual positive       True positive (TP)     False negative (FN)
Actual negative       False positive (FP)    True negative (TN)
Based on these values, several widely used evaluation metrics are calculated:
Accuracy: Measures the proportion of correctly predicted samples relative to the total number of samples, as defined in equation (1).

Accuracy = (TP + TN) / (TP + TN + FP + FN) (1)

Precision: Represents the proportion of truly positive samples among those identified as positive, as defined in equation (2).

Precision = TP / (TP + FP) (2)

Recall: Captures the proportion of correctly identified positive samples among all actual positive cases, as defined in equation (3).

Recall = TP / (TP + FN) (3)

F1-score: Combines precision and recall into a single metric, providing a balanced assessment of model performance, as defined in equation (4).

F1 = 2 × (Precision × Recall) / (Precision + Recall) (4)

The Receiver Operating Characteristic (ROC) curve shows the success of a classification model across several classification thresholds, plotting the True Positive Rate (Recall) against the False Positive Rate (FPR). 98 AUC is an abbreviation for “Area Under the ROC curve”; it reflects the classifier’s ability to distinguish between positive and negative samples across various classification thresholds. The FPR can be defined as in equation (5).

FPR = FP / (FP + TN) (5)
For multi-class classification tasks, performance is typically evaluated using macro-averaged metrics, which compute the mean of the per-class performance scores, as in equations (6) to (8).

Macro-Precision = (1/N) Σ_{i=1}^{N} Precision_i (6)

Macro-Recall = (1/N) Σ_{i=1}^{N} Recall_i (7)

Macro-F1 = (1/N) Σ_{i=1}^{N} F1_i (8)

Here, N denotes the number of classes, and Precision_i, Recall_i, and F1_i are the precision, recall, and F1-score computed for class i.
However, when dealing with highly imbalanced datasets, these conventional metrics may fail to accurately reflect model performance. In such cases, additional evaluation indicators are recommended:
Matthews Correlation Coefficient (MCC) measures the correlation between the actual and the predicted values of the instances, as in equation (9).

MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN)) (9)

Cohen’s kappa (κ) compares how well the binary classifier performs relative to the accuracy expected by chance, as in equation (10).

κ = (p_o − p_e) / (1 − p_e) (10)

Brier Loss measures the mean squared error between the predicted probabilities and the actual outcomes, as in equation (11).

Brier = (1/N) Σ_{i=1}^{N} (p_i − o_i)² (11)

Specifically, p_o denotes the observed agreement between predictions and true labels, p_e the agreement expected by chance, p_i the predicted probability for instance i, o_i its actual outcome, and N the total number of instances.
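All of the metrics above have off-the-shelf implementations; as a practical illustration, the sketch below computes them with scikit-learn on small, invented prediction arrays.

```python
from sklearn.metrics import (accuracy_score, brier_score_loss, cohen_kappa_score,
                             confusion_matrix, f1_score, matthews_corrcoef,
                             precision_score, recall_score, roc_auc_score)

# Invented ground-truth labels, hard predictions, and predicted
# probabilities of the positive (misinformation) class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print("Accuracy :", accuracy_score(y_true, y_pred))     # equation (1)
print("Precision:", precision_score(y_true, y_pred))    # equation (2)
print("Recall   :", recall_score(y_true, y_pred))       # equation (3)
print("F1-score :", f1_score(y_true, y_pred))           # equation (4)
print("FPR      :", fp / (fp + tn))                     # equation (5)
print("AUC      :", roc_auc_score(y_true, y_prob))
print("MCC      :", matthews_corrcoef(y_true, y_pred))  # equation (9)
print("Kappa    :", cohen_kappa_score(y_true, y_pred))  # equation (10)
print("Brier    :", brier_score_loss(y_true, y_prob))   # equation (11)
```

For multi-class tasks, the macro-averaged variants in equations (6) to (8) are obtained by passing average="macro" to the precision, recall, and F1 functions.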
In summary, a comprehensive evaluation of misinformation detection models requires both traditional and specialized performance metrics. Standard indicators such as accuracy, precision, recall, F1-score, and AUC remain fundamental. Yet, for imbalanced data scenarios, metrics like MCC, Cohen’s Kappa, and Brier Loss provide essential complementary insights, ensuring a more reliable and nuanced assessment of model performance.
Machine Learning Detection Methods
The rapid proliferation of health misinformation has necessitated the development of automated detection systems to mitigate its impact. In this context, machine learning (ML) and artificial intelligence (AI) techniques have emerged as pivotal tools for identifying misinformation, particularly across social media platforms. 68 A growing body of research demonstrates the efficacy of ML-based approaches in filtering and categorizing misleading content, with applications ranging from general fake news detection to health-specific misinformation analysis.9,99 The general framework for health misinformation detection based on machine learning is illustrated in Figure 3.
Figure 3. General framework of machine learning-based health misinformation detection.
Performance Evaluation Among Machine Learning Algorithms
Health misinformation detection is fundamentally a text classification task, and many researchers employ traditional ML algorithms to detect misinformation. The effectiveness of these algorithms depends on task-specific feature engineering. 100 This section reviews the literature on various ML models applied to health misinformation detection and evaluates their relative performance. A summary of baseline ML algorithms used in the reviewed literature and their functions is presented in Appendix 3.
Among traditional ML algorithms, Random Forest (RF) consistently demonstrates strong performance across health misinformation detection tasks. Zhao et al developed a feature set incorporating both central and peripheral-level attributes for detecting misinformation in online health communities. Their RF model achieved an F1-score of 0.848, outperforming other classifiers. 93 Similarly, Safarnejad et al detected Zika virus misinformation by combining dissemination network features, retweet signal features, content-based features, and user features. Using all feature categories as input, the RF model achieved the best performance with an F1-score of 0.859. 62 Khan et al employed a named-entity recognition (NER) approach to extract 39 linguistic and affective features from a COVID-19 dataset. When training multiple ML classifiers, the RF model exhibited the highest F1-score of 0.888. 76
Beyond RF, ensemble learning methods such as Gradient Boosting Decision Trees (GBDT) and AdaBoost have also demonstrated high efficacy. Liu et al analyzed health misinformation in Chinese social media using 104 linguistic and statistical features. Their experiments with multiple ML models revealed that GBDT achieved the highest precision of 0.837. 4 Al-Rakhami and Al-Amri utilized 6 machine learning algorithms and integrated ensemble learning techniques. They extracted tweet-level and user-level features to investigate the credibility of COVID-19 misinformation on Twitter. The best-performing ensemble model, with SVM and RF as base models and C4.5 as the meta-model, achieved an accuracy of 0.978. 72 Amjad et al constructed a benchmark dataset for Urdu fake news detection across 5 domains, including health. They extracted n-gram features such as character n-grams, word n-grams, and function n-grams. Among 7 machine learning classifiers, the AdaBoost achieved the highest performance, with an F1 score of 0.90 for real news. 101
Evolutionary algorithms have also been explored in health misinformation detection. Al-Ahmad et al proposed an evolutionary approach for COVID-19 fake news detection. They used 3 metaheuristic algorithms: Binary Genetic Algorithm (BGA), Binary Particle Swarm Optimization (BPSO), and Binary Salp Swarm Algorithm (BSSA). These algorithms helped reduce redundant features and improve detection accuracy. Among them, KNN-BGA achieved the highest accuracy of 75.43% while significantly minimizing feature redundancy. 102 Qasem et al leveraged a stacked classifier (LR combined with Genetic Algorithm-SVM) to detect COVID-19 rumors in Arabic, achieving an accuracy of 0.926. 103 Nabożny et al employed LR and Recursive Feature Elimination for feature selection. They optimized models using the Tree-based Pipeline Optimization Tool (TPOT) library, which applies a genetic algorithm to a pool of models including LR, XGBoost, and MLP. This approach enhanced the efficiency of evaluating unreliable medical statements. 23
Several studies have also examined the impact of different feature representations across ML models. Nistor and Zadobrischi employed DT, NB, and KNN to classify fake news spread on social media during the COVID-19 epidemic, achieving 99.63% accuracy using a Term Frequency-Inverse Document Frequency (TF-IDF) vector combined with Python and JavaScript. 104 Mazzeo et al analyzed COVID-19 misinformation on search engines by extracting text and URL features and comparing the performance of ML algorithms, including SVM, stochastic gradient descent (SGD), LR, NB, and RF. Their findings indicated that when oversampling the data, Naive Bayes based on the bag-of-words (BoW) model and URL features was the most effective classifier, achieving the highest F1-score of 0.81. 105 Dhoju et al tested multiple classical ML methods for health article reliability detection, including MNB, LSVC, RF, and LR. They experimented with 3 feature combinations: words or n-grams (W), extracted features (EF), and W + EF. LSVC outperformed other models in all combinations, achieving a macro F-measure of 96% with the W + EF feature combination. 106 Alsmadi et al integrated multiple COVID-19 misinformation datasets to compare the performance of ML models on individual versus aggregated datasets. They also tested various word and sentence embedding models, including W2V, Glove, and BERT. Their findings revealed that integrating datasets mitigated class imbalance and improved model robustness, while word-embedding techniques consistently enhanced classification accuracy across all evaluated classifiers. 33
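To make the workflow reviewed above concrete, the following minimal sketch pairs a TF-IDF n-gram representation with a Random Forest classifier in a cross-validated scikit-learn pipeline; the toy corpus is invented for illustration, and the configuration does not reproduce any specific study.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Invented toy corpus of health claims (1 = misinformation, 0 = reliable).
texts = ["garlic cures covid-19 overnight",
         "vaccines undergo multi-phase clinical trials",
         "drinking bleach eliminates the virus",
         "handwashing reduces infection transmission"]
labels = [1, 0, 1, 0]

pipeline = Pipeline([
    # Word unigrams and bigrams weighted by TF-IDF.
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    # Ensemble classifier of the kind that performs strongly in this field.
    ("clf", RandomForestClassifier(n_estimators=200, random_state=42)),
])

# Cross-validated F1 mirrors the evaluation protocol common in this field.
scores = cross_val_score(pipeline, texts, labels, cv=2, scoring="f1")
print("Mean F1:", scores.mean())
```

Swapping the vectorizer (eg, BoW via CountVectorizer) or the classifier (eg, GBDT, AdaBoost) reproduces the kind of feature-model comparisons described above.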
In summary, traditional ML algorithms exhibit significant performance differences and applicability in the task of health misinformation detection. Ensemble learning methods, such as RF, GBDT, and AdaBoost, further enhance classification performance by leveraging the strengths of multiple base models, particularly excelling in handling high-dimensional features and complex data distributions. Additionally, evolutionary algorithms significantly improve detection accuracy and reduce feature redundancy by optimizing feature selection and model parameters. The impact of feature representation methods, such as TF-IDF, BoW, and embedding models, on model performance has also been extensively validated, with embedding-based feature representations demonstrating remarkable improvements in classification accuracy. These findings highlight the importance of selecting appropriate classification algorithms, optimization techniques, and feature representation methods to achieve optimal performance in health misinformation detection, offering valuable insights for future research and practical applications in this domain.
Comparative Analysis of Machine Learning and Other Models
While ML has been widely applied in misinformation detection, other models, particularly deep learning (DL) models, have also attracted significant attention for this task. This section presents a comparative analysis of ML and alternative models, highlighting their respective strengths and limitations. A summary of key DL algorithms discussed in the reviewed literature and their functionalities is provided in Appendix 4.
In some cases, ML algorithms outperform other models. Sicilia et al proposed a method to detect rumors in health-related Twitter posts within a single topic domain. Their approach introduced measurements of influence potential and network features to compare multiple algorithms (MLP, Nearest Neighbor, SVM, Random Tree, Multiclass Adaboost, and RF) through a wrapper method. The results revealed that the RF classifier achieved the highest accuracy, correctly identifying approximately 90% of rumors with acceptable precision. 85 Mahara et al compared 5 traditional ML techniques—DT, RF, SVM, AdaBoost-DT, and AdaBoost-RF—with 2 hybrid DL approaches—CNN-LSTM and CNN-Bi-LSTM—using the Fake News Healthcare dataset. The AdaBoost-RF model, which combined news content and readability features, achieved the highest performance, with an F1 score of 0.989, indicating its greater suitability for practical applications compared to complex deep learning models. 107 Sharma and Garg explored 5 ML algorithms (DT, NB, LR, RF, and KNN) and 2 DL algorithms (LSTM and Bi-LSTM) for classifying COVID-19 fake news text in India. Additionally, they employed 2 convolutional neural networks (VGG-16 and ResNet-50) for image classification. Among the text classifiers, RF achieved the highest accuracy (94%), while ResNet-50 achieved 76.6% accuracy in image classification when the image size was set to 256 × 256. 108 Bonet-Jover et al introduced the Spanish-language RUN dataset, which focuses on health and COVID-19 misinformation. They tested multiple ML algorithms, including SVM, RF, LR, DT, AdaBoost, and Gaussian Naive Bayes, along with deep learning models like MLPs. Using a fine-grained labeling scheme based on journalistic techniques, they found that the RUN-AS labeled DT algorithm performed best, achieving a macro F1 score of 0.948. 84 Madani et al developed a coronavirus-related fake news detection model using Spark, Tweepy, and several ML algorithms (LR, DT, RF, NB, Gradient-Boosted Tree, SVM), alongside 1 DL model (MLP). The RF algorithm outperformed all other models, achieving an accuracy of 0.79 and a recall of 100%. 109 Alhakami et al compared 6 machine learning models (LR, NB, SVM, RF, KNN, DT) and 2 deep learning models (CNN, LSTM) for detecting COVID-19 misinformation. The machine learning models employed a BoW representation with TF-IDF features, while the deep learning models utilized GloVe word embeddings. Across 2 datasets, the ML models consistently outperformed the DL models and demonstrated lower computational costs. 110 Narra et al introduced a fake news detection method using feature extraction techniques (TF-IDF, BoW) and feature selection methods (PCA, Chi-square). They evaluated 7 ML models, including RF, Extra Tree, Gradient Boosting Machine, LR, NB, stochastic gradient (SG), and a voting classifier comprising LR and SG, alongside 4 DL models (CNN, LSTM, ResNet, and InceptionV3). Their findings revealed that the Extra Trees (ET) classifier achieved the highest accuracy (94.74%) when combining TF-IDF and BoW. The study also demonstrated that thorough text preprocessing significantly improved detection accuracy. 111
In contrast, there are cases where other models outperform ML models. Du et al evaluated the performance of traditional ML algorithms (SVM, LR, and Extremely Randomized Trees) against 2 DL models (CNN and RNN) for identifying misinformation about the HPV vaccine. The ML models employed TF-IDF for feature extraction, whereas the DL models utilized GloVe embeddings for feature representation. Among these algorithms, CNN achieved the best performance, with an AUC of 0.794. 66 Shams et al proposed SEMiNExt, an extension for a real-time health misinformation notifier. They compared 6 ML algorithms with an ANN-based DL approach. The results showed that SEMiNExt, using the ANN, achieved the best performance, with an F1 score of 92%. 112 Hayawi et al tested XGBoost, LSTM, and BERT models for classifying COVID-19 vaccine misinformation. The results revealed that BERT outperformed the others with an F1 score of 0.98. 67 Alghamdi et al explored several ML models and fine-tuned pre-trained transformer models, including BERT and COVID-Twitter-BERT (CT-BERT), for detecting COVID-19 fake news. Incorporating Bi-GRU on top of the CT-BERT model yielded state-of-the-art performance, with an F1 score of 0.985. 113 Dai et al compared Random guess, Unigram, Unigram-News Source, Unigram-Tags, SVM, RF, CNN, Bi-GRU, and SAF (Social Article Fusion) for misinformation detection on the FakeHealth dataset, with SAF emerging as the top performer, achieving an accuracy of 0.810. 61 Iwendi et al proposed 39 features, including sentiment, linguistic, and named entity recognition features, for detecting COVID-19 misinformation. They compared the performance of 3 machine learning algorithms (AdaBoost, DT, and KNN) and 3 deep learning algorithms (GRU, LSTM, and RNN). After feature extraction, the deep learning models consistently outperformed the machine learning models. Among them, GRU achieved the highest accuracy of 86.12%. 114 Bangyal et al compared 7 ML algorithms with 6 DL algorithms for detecting COVID-19 misinformation, with Bi-LSTM and CNN achieving the highest accuracy of 97%. 115 Tomaszewski et al tested CNN, Bi-LSTM, SVM, and NB for detecting HPV vaccine misinformation, finding CNN to have the best overall performance, with an accuracy of 0.920. 116 Albalawi et al compared traditional machine learning algorithms with deep learning architectures on Arabic COVID-19-related Twitter data. The Bi-LSTM model, combined with Mazajak skip-gram pre-trained word embeddings, achieved the best performance with an F1 score of 75.2%. 117 Dharawat et al proposed a categorization framework for detecting the severity of health misinformation. They evaluated several models, including RF, SVM, LR, Bi-LSTM, CNN, BERT, Hierarchical Attention Networks, and dEFEND. The BERT model, enhanced with data augmentation and a customized loss function, achieved the best performance with an F1 score of 0.41. This result suggests that distinguishing the severity of misinformation is more challenging than simply classifying it as true or false. 96 Al-Yahya et al collected datasets such as ArCOV19-Rumors and Covid-19-Fakes to study Arabic fake news detection. They evaluated linear models, deep learning models (CNN, RNN, GRU), and Transformer-based models (AraBERT, QARiB, etc.). Results show that Transformer-based models outperform neural networks, with QARiB achieving the highest F1 scores and accuracy. 118 Abdelminaam et al compared the performance of 6 traditional ML models and 2 DL models (Modified LSTM and Modified GRU) in detecting COVID-19 fake tweets. The traditional ML models employed TF-IDF and n-grams for feature analysis, while the deep learning models leveraged word embeddings. Optimization was performed using grid search for ML models and Keras tuning for DL models. The 2-layer LSTM achieved the best test results, achieving an accuracy of 98.6%. 119
Both ML and DL algorithms have inherent strengths, which vary depending on the nature of the dataset and the task at hand. Di Sotto and Viviani identified 6 types of health misinformation features to compare the performance of ML and DL methods across different datasets. Their experimental results revealed that the optimal classifier-feature configurations differ for each dataset. While DL solutions are highly effective when utilizing word-embedded features alone, the importance of ML classifiers becomes more apparent when additional types of features are considered. 120 Elhadad et al proposed a voting-based integrated ML classifier using 10 algorithms: DT, MNB, BNB, LR, KNN, Perceptron, ANN, LSVM, Ensemble RF, and XGBoost. Their results showed that the ANN classifier excelled in binary classification tasks. The DT classifier helped reduce misclassification of real documents as misleading. Meanwhile, the LR classifier minimized the misclassification of misleading documents as real. 121
In summary, the performance of algorithms for health misinformation detection is highly context-dependent. Traditional machine learning methods remain competitive when feature engineering is mature, computational resources are limited, or a balance between interpretability and accuracy is necessary. Ensemble strategies can further improve classification performance by combining diverse features. Neural network-based approaches excel in capturing high-dimensional semantic relationships and are well-suited for cross-lingual or multimodal tasks. However, these methods often require significant computational resources, with performance gains coming at the cost of efficiency. The choice of detection methods should consider task-specific demands, dataset characteristics, and available resources. Future research should explore hybrid models that combine the advantages of traditional machine learning and neural networks to improve the accuracy and robustness of health misinformation detection systems.
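To make these ensemble strategies concrete, the following minimal sketch (assuming scikit-learn) combines TF-IDF features with soft voting over LR, NB, and RF classifiers; the toy texts, labels, and hyperparameters are purely illustrative and not drawn from any cited system.

```python
# Illustrative sketch: TF-IDF features fed to a soft-voting ensemble.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

texts = ["drinking bleach cures covid", "vaccines are tested in clinical trials"]
labels = [1, 0]  # 1 = misinformation, 0 = reliable (hypothetical toy data)

ensemble = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("nb", MultinomialNB()),
            ("rf", RandomForestClassifier(n_estimators=200)),
        ],
        voting="soft",  # average predicted class probabilities across models
    ),
)
ensemble.fit(texts, labels)
print(ensemble.predict(["garlic water kills the virus"]))
```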
Deep Learning Detection Methods
Traditional ML algorithms rely on manually engineered features for classification, so their performance is often constrained by the quality and comprehensiveness of these features. In contrast, deep learning (DL)—a subset of ML based on deep neural networks—eliminates the need for manual feature extraction by leveraging the self-learning capability of network layers to automatically extract textual features from raw data, thereby achieving superior performance in natural language processing tasks such as text classification.66,122 We introduce DL methods for health misinformation detection from 2 perspectives: unimodal misinformation detection and multimodal misinformation detection. The general framework for health misinformation detection based on deep learning is illustrated in Figure 4.

General framework of deep learning-based health misinformation detection.
Unimodal Misinformation Detection
Unimodal learning refers to learning from data of only 1 modality, while multimodal learning refers to learning from data of multiple modalities simultaneously. To improve the performance of health misinformation detection, several scholars have applied different DL algorithms to exploit the potential of unimodal data.
Traditional Deep Learning Models
Traditional deep learning models encompass a range of neural network architectures that have been widely utilized for various ML tasks, including misinformation detection. Among these, Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) are 2 of the most prominent. CNNs are deep feedforward networks characterized by a shared-weights architecture and translation invariance, originally designed for image analysis. However, recent research has demonstrated their effectiveness in sequential data analysis, 123 highlighting their adaptability to diverse data types, including textual misinformation detection. On the other hand, RNNs are specifically designed for processing sequential data by maintaining contextual information through their recurrent connections. Variants such as LSTM and GRU have been developed to address issues like vanishing gradients, making RNNs particularly effective for handling time-series and textual data.
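As a concrete illustration of how such architectures process text, the sketch below (assuming PyTorch) applies parallel 1D convolutions of different filter sizes followed by max-pooling over time, in the spirit of the multi-filter text-CNN designs discussed in this subsection; the vocabulary size, embedding dimension, and filter sizes are illustrative choices, not values from any cited study.

```python
# Minimal text-CNN sketch: parallel 1D convolutions over token embeddings.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size=30_000, embed_dim=100, num_classes=2,
                 filter_sizes=(3, 4, 5), num_filters=100):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # One 1D convolution per filter size captures n-gram-like patterns.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in filter_sizes
        )
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed, seq)
        # Max-pool each feature map over time, then concatenate channels.
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))       # (batch, num_classes)

logits = TextCNN()(torch.randint(0, 30_000, (8, 64)))  # toy batch of 8 texts
```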
CNN-based models have demonstrated significant effectiveness in detecting misinformation, particularly in health-related domains. Kaliyar et al introduced a multi-channel deep CNN that integrates 3 parallel one-dimensional convolutional network (1D-CNN) channels. Each channel applies filters of different sizes to capture expressions of varying lengths. The model employs GloVe for word embeddings and was evaluated on 2 health-related datasets. Experimental results show that the proposed CNN model is effective in distinguishing health misinformation. 124 Luo et al examined RNN, CNN, and fastText on their proposed Chinese infodemic dataset. Their findings revealed that fastText achieved superior results in identifying false records, while CNN surpassed the other 2 models in accurately recognizing real records. Overall, CNN proved to be the optimal choice for misinformation classification in this study. 88 Kumar et al applied TF-IDF for feature extraction and Modified Grasshopper Optimization (MGO) for feature selection, with CNN then extracting n-gram features for classification. Their OptNet-Fake model achieved 98.43% accuracy on the COVID-19 Fake News dataset. 125
RNNs and their advanced variants, including LSTM and GRU, have been widely used to process sequential text data for fake news detection. Chen et al compared the performance of LSTM, GRU, and Bi-LSTM in detecting COVID-19 misinformation. The study found that Bi-LSTM achieved the best detection performance, with accuracy rates of 94%, 99%, and 82% on English short texts, English long texts, and Chinese texts, respectively. 17 Chen and Lai proposed a COVID-19 misinformation detection framework that integrates fuzzy clustering and DL. They used Fuzzy C-Means (FCM) clustering to filter text features before classifying them with Bi-LSTM, GRU, and LSTM. Their results showed that Bi-LSTM achieved 99% accuracy on English datasets and 86% on Chinese datasets. Fuzzy clustering reduced detection time by 10 to 15%, but caused an 8% drop in accuracy for Chinese datasets. This approach improves detection efficiency while maintaining high classification accuracy, making it suitable for large-scale misinformation screening. 126
In the field of health misinformation detection, CNNs and RNNs, along with their variants, have emerged as 2 pivotal deep learning models, each demonstrating notable strengths and robust performance due to their unique architectures. CNNs excel at capturing local features in data through convolutional operations, while RNNs and their variants are particularly effective in handling sequential textual data, adept at modeling temporal dependencies and contextual information. To further enhance model performance, researchers have employed a variety of strategies. In the text representation stage, methods such as GloVe word embeddings and word-frequency-based features are widely adopted to transform textual data into vector form. In terms of feature selection, optimization algorithms and clustering techniques are utilized to eliminate redundant features and reduce data dimensionality, which not only lowers computational overhead but also improves the stability and accuracy of model training. In summary, CNNs and RNNs, each with their respective advantages, play complementary roles in health misinformation detection. Combined with diverse text processing techniques, they provide a strong foundation for building efficient and accurate health misinformation detection systems.
Attention-Based Deep Learning Models
Self-attention, sometimes called intra-attention, is an attention mechanism that relates different positions of a single sequence in order to compute a representation of that sequence. It has been used successfully in a variety of tasks, including reading comprehension, abstractive summarization, textual entailment, and learning task-independent sentence representations. The Transformer is the first transduction model relying entirely on self-attention to compute representations of its input and output, without using sequence-aligned RNNs or convolution. 127 BERT is a pre-trained language model based on the Transformer architecture. It was the first fine-tuning-based representation model to achieve state-of-the-art performance on a large number of sentence-level and token-level tasks, outperforming many task-specific architectures. 128
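The core computation can be made explicit. The following NumPy sketch implements scaled dot-product self-attention as formulated for the Transformer; the sequence length, dimensions, and random weights are illustrative only.

```python
# Scaled dot-product self-attention: every token attends to every other
# token of the same sequence (shapes are illustrative).
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # project tokens to Q/K/V
    scores = Q @ K.T / np.sqrt(K.shape[-1])      # pairwise similarity, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                           # weighted mix of value vectors

X = np.random.randn(6, 16)                       # 6 tokens, 16-dim embeddings
W = [np.random.randn(16, 16) for _ in range(3)]  # random projection matrices
print(self_attention(X, *W).shape)               # (6, 16)
```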
Given the effectiveness of attention-based models, numerous studies have leveraged these mechanisms for health misinformation detection. Paka et al introduced Cross-SEAN, a neural attention model that incorporates tweet text, tweet features, user features, and external knowledge. It uses cross-stitch units to share parameters between tweet and user features, with outputs connected at both early and late network stages to enhance information sharing. On a large-scale COVID-19 Twitter dataset, it achieved an F1 score of 0.953, surpassing the benchmark by over 9%. 73 Ding et al proposed EvolveDetector to detect fake news in emerging events. The model employs Word2Vec and Text-CNN for feature extraction. A hard-attention-based knowledge storage mechanism is designed to record neuron parameters through knowledge memory and event masks, reducing interference from dissimilar events. The event mask is used to assess event relationships, and a multi-head self-attention mechanism is leveraged to filter similar historical events for knowledge transfer. Experimental results on COVID-19 and other datasets demonstrate that the proposed model outperforms baseline models, especially in generalizing to new events. 129
With the rise of pretrained language models (PLMs), advanced neural architectures such as BERT have been widely adopted for misinformation detection. 130 Studies have explored various strategies to enhance model robustness, including fine-tuning, ensemble learning, domain adaptation, and interpretability techniques. For instance, Morita et al employed a fine-tuned BERTBASE (uncased) model with a tanh classifier to classify public health and political misinformation and, given the class imbalance in their data, evaluated performance with the Matthews correlation coefficient (MCC = 0.819). 131 Dementieva and Panchenko proposed a Cross-Lingual Evidence (CE)-based approach, integrating cross-lingual news retrieval with content similarity analysis. By combining CE features with pre-trained models like BERT and RoBERTa, their method significantly enhanced detection performance. On the ReCOVery dataset, RoBERTa with CE features achieved an F1 score of 0.975, demonstrating the power of cross-lingual evidence in misinformation identification. 132 To further improve performance, ensemble approaches have been employed, such as that of Malla and Alphonse, who combined RoBERTa and CT-BERT with a fusion vector multiplication technique, achieving exceptional accuracy (98.88%) and F1 score (98.93%) on a COVID-19 fake news dataset. 133 Similarly, Das et al employed several pre-trained language models, including RoBERTa and XLNet, as base detectors. Their predictions were combined through soft voting, and the resulting ensemble was further integrated with a Statistical Feature Fusion Network (SFFN). The SFFN incorporated metadata-based statistical features such as URL domains, usernames, and news authors. In addition, they applied heuristic post-processing, which adjusted the model’s initial predictions based on a set of rule-based heuristics. Their approach was applied to a COVID-19 fake news dataset and achieved a best F1 score of 0.989. 134 Kumar et al proposed an ensemble deep learning model that integrates CT-BERT, BERTweet, and RoBERTa. The model combines the predictions of the 3 base models through class probability averaging. It achieved a weighted F1 score of 0.99 in detecting COVID-19 fake news. 135 Malla and Alphonse introduced the MVEDL model, which uses majority voting with RoBERTa, CT-BERT, and BERTweet classifiers, achieving accuracy and F1 scores of 91.75% and 91.14%, respectively. 136 Beyond ensemble learning, domain adaptation and multitask learning have proven effective in enhancing misinformation detection. Abd Elaziz et al employed a pre-trained AraBERT model for feature extraction using fine-tuning and multi-task learning. They used an improved Fire Hawk Optimizer (FHO) for feature selection. Their method enhanced the classification performance on an Arabic COVID-19 misinformation dataset. 137 Qasim et al evaluated 9 pre-trained BERT-based models on 3 datasets: COVID-19 fake news, COVID-19 English tweets, and extremism detection. RoBERTa-base performed best for COVID-19 fake news detection, while BART-large excelled on COVID-19 English tweets. 138 In addition, model interpretability has attracted increasing attention. Ayoub et al proposed an explainable NLP framework that combines DistilBERT with SHAP (SHapley Additive exPlanations). This approach achieved strong performance in detecting COVID-19 misinformation, with an accuracy of 0.938, while also enhancing transparency in classification. 139
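As a hedged illustration of the fine-tuning recipe shared by many of these PLM-based studies, the sketch below fine-tunes a generic BERT checkpoint for binary misinformation classification using the Hugging Face Transformers and Datasets libraries; the model name, toy examples, and hyperparameters are stand-ins rather than the configuration of any cited system.

```python
# Generic PLM fine-tuning sketch (not any cited study's exact setup).
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 0 = reliable, 1 = misinformation

train_ds = Dataset.from_dict({
    "text": ["masks do not work at all", "handwashing reduces infection"],
    "label": [1, 0],  # toy labels for illustration only
})

def tokenize(batch):  # pad/truncate so all examples share a fixed length
    return tokenizer(batch["text"], truncation=True, padding="max_length",
                     max_length=64)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ckpt", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_ds.map(tokenize, batched=True),
)
trainer.train()  # updates all BERT weights plus the classification head
```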
In summary, attention-based deep learning models, particularly those built upon Transformer architectures such as BERT, have demonstrated significant effectiveness in the task of health misinformation detection. These models leverage self-attention mechanisms to capture complex contextual relationships within text, enabling superior performance across a range of datasets and misinformation types. Research has explored diverse approaches to further enhance model performance and generalizability, including the integration of tweet- and user-level features, knowledge memory mechanisms, ensemble learning strategies, domain adaptation, and interpretability techniques. The adoption of pre-trained language models such as BERT, RoBERTa, CT-BERT, and AraBERT has proven particularly impactful, often surpassing traditional architectures and benchmark baselines. Moreover, the growing emphasis on explainability reflects a shift toward not only improving predictive accuracy but also ensuring model transparency and trustworthiness. Collectively, these advancements highlight the evolving landscape of attention-based methods in combating health-related misinformation, especially in the context of COVID-19.
Graph Neural Networks
A graph neural network (GNN) is a novel technique that applies DL algorithms over graph structures. 140 Many methods have been implemented to detect and prevent the spread of fake news over the past decade, among which the GNN-based approach is the most recent. 141 Existing approaches for fake news detection focus almost exclusively on content, propagation, or social-context features separately in their models. GNNs promise to be a potentially unifying framework for combining content-, propagation-, and social-context-based approaches. 142 GNN-based methods for misinformation detection can be categorized into 4 groups: conventional GNN-based, GCN-based (Graph Convolutional Network), AGNN-based (Attention Graph Neural Network), and GAE-based (Graph Autoencoder) approaches. 141
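To ground the terminology, the sketch below implements a single graph-convolution layer in NumPy using the normalized-adjacency propagation rule popularized by Kipf and Welling; the toy graph and feature dimensions are illustrative.

```python
# One GCN layer: node features are mixed over neighborhoods using the
# symmetrically normalized adjacency matrix (self-loops added first).
import numpy as np

def gcn_layer(A, H, W):
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)         # D^{-1/2}
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0)  # ReLU

A = np.array([[0, 1, 1],                    # toy 3-node graph (e.g., a news
              [1, 0, 0],                    # item linked to 2 users)
              [1, 0, 0]], dtype=float)
H = np.random.randn(3, 8)                   # initial node features
print(gcn_layer(A, H, np.random.randn(8, 4)).shape)  # (3, 4)
```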
Recent studies have explored various GNN architectures and training strategies to improve fake news detection performance. Liao et al proposed the Fake News Detection Multi-Task Learning (FDML) model, which integrates textual content and contextual information through a news graph (N-Graph). The model employs a graph convolutional network (GCN) within a multi-task learning framework to jointly address fake news detection and topic classification. Compared to state-of-the-art models, FDML achieved about a 6% improvement in both average accuracy and Macro-F1 score. 143 Karnyoto et al constructed word–word and word–document nodes within a graph structure. They applied data augmentation techniques in combination with various GNN models, including GCN, Graph Attention Networks (GAT), and GraphSAGE (Sample and Aggregate), to detect COVID-19-related fake news. Experimental results showed that GraphSAGE, when combined with data augmentation, achieved the highest accuracy and F1 score among all evaluated models. 144 Cui et al proposed Hetero-SCAN, which constructs a heterogeneous graph with publisher, news, and user nodes, along with multiple edge types. Node features were extracted, and 2 meta-paths were defined to capture contextual information. The model used encoding, aggregation, and semantic fusion to learn effective news representations. A cross-entropy loss function was applied to update model parameters. Evaluations on the FANG and FakeHealth datasets showed that Hetero-SCAN outperformed various baseline methods, addressing the challenge of using multi-level social context and temporal information for fake news detection. 145 Min et al developed a dataset comprising multi-topic news, a large number of posts, and user social relationships, with 2 topics related to health. They framed social-context-based fake news detection as a heterogeneous graph classification task and introduced the Post-User Interaction Network (PSIN) model. They also used an adversarial topic discriminator to encourage the model to learn topic-independent features, improving its generalization to emerging events. Experimental results in both in-topic and out-of-topic settings showed that PSIN significantly outperformed all state-of-the-art baselines. 146
Overall, GNNs offer a unified framework for integrating content features, propagation patterns, and social context, thanks to their structured modeling capabilities. They have shown significant advantages in misinformation detection. Recent studies have improved GNN performance through several innovative strategies. At the data level, augmentation techniques are used to enhance model robustness. In graph structure design, multi-type nodes and meta-paths help capture heterogeneous relationships. At the algorithmic level, the use of graph attention mechanisms, hierarchical aggregation functions, and adversarial training enhances feature representation.
Transfer Learning Models
In the context of health misinformation detection, transfer learning and domain adaptation techniques offer significant advantages. They enable models to generalize across platforms and adapt to new and evolving domains. These techniques are especially useful for combating health-related misinformation, which can emerge quickly and vary across regions, languages, and platforms. By leveraging knowledge from source domains and applying it to target domains with limited labeled data, transfer learning models can efficiently address domain shifts, improve detection accuracy, and make systems more robust. Below, we review key studies that have successfully applied these methods to health misinformation detection.
Misinformation detection models need to be both generalizable across different platforms and adaptable to new domains. Joshi et al developed a universal misinformation classifier across multiple platforms using Domain-Adversarial Neural Networks (DANN). They incorporated the Local Interpretable Model-Agnostic Explanations (LIME) framework for explainability. Their study, using COVID-19 misinformation as a case study, demonstrated that DANN improved the F1 score, increasing classification accuracy by 3% and AUC by 9%. 68 Yue et al tackled the domain adaptation problem in early-stage COVID-19 misinformation detection with a Contrastive Adaptation Network for Misinformation Detection (CANMD). The model used pseudo-labeling, label correction, and contrastive adaptation loss to improve adaptation between source and target domains. Experiments with RoBERTa on multiple datasets showed that CANMD achieved an average improvement of 11.5% and up to 34.5% under label shift conditions. 147 Yue et al addressed cross-domain misinformation detection in low-resource settings with MetaAdapt, a meta-learning-based few-shot detection model. By leveraging Task Similarity Computation and Gradient Rescaling, MetaAdapt facilitated knowledge transfer between source and target domains. Experiments on the CoAID, Constraint, and ANTiVax datasets confirmed its superiority over state-of-the-art baselines and large language models, making it highly effective for misinformation detection in emerging domains. 148 Dhankar et al applied transfer learning to detect COVID-19 misinformation on social media. Their approach combined General Twitter Embedding (GTE) with a Context-Specific Embedding (CSE) to enhance text classification. They used SVM and MLP as classifiers. Two strategies were explored: Augmented Transfer Learning (ATE) and Concatenation Transfer Learning (CTL). The results showed that concatenating GTE and CSE significantly improved performance. 149
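The adversarial component of DANN-style domain adaptation hinges on a gradient-reversal layer: features pass through unchanged on the forward pass, while gradients from the domain classifier are flipped, pushing the encoder toward platform-invariant representations. The following PyTorch sketch is a generic rendering of this trick under illustrative shapes, not the cited authors' implementation.

```python
# Gradient-reversal layer, the core trick behind DANN-style adaptation.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)                    # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None    # reverse (and scale) gradients

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# During training, shared features feed both heads (toy shapes):
features = torch.randn(4, 32, requires_grad=True)
label_logits = torch.nn.Linear(32, 2)(features)               # misinfo head
domain_logits = torch.nn.Linear(32, 2)(grad_reverse(features))  # domain head
```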
As misinformation spreads across languages, cross-lingual detection methods have become crucial. Du et al introduced CrossFake, a model designed to detect low-resource language (Chinese) COVID-19 fake news using BERT-based embeddings trained on high-resource language (English) data. For Chinese news, they used Google Translate to convert content into English before applying the same processing pipeline. Comparative experiments showed that CrossFake outperformed multiple monolingual and cross-lingual baselines, validating its effectiveness for cross-lingual misinformation detection from high-resource to low-resource languages. 81 Ghayoomi and Mousavian utilized XLM-RoBERTa and a parallel CNN to detect Persian COVID-19 fake news. They applied cross-lingual and cross-domain knowledge transfer by incorporating an English COVID-19 dataset (for cross-lingual transfer) and a general domain Persian fake news dataset (for cross-domain transfer). This combined approach improved performance, achieving an F1 score of 0.721, surpassing the baseline by 2.39%. 82
Overall, transfer learning models provide a scalable solution for unimodal health misinformation detection through domain alignment, knowledge integration, and cross-lingual transfer mechanisms. Studies have shown that adversarial training and contrastive adaptation effectively mitigate domain shift, while meta-learning frameworks maintain detection robustness in low-resource scenarios by leveraging task similarity metrics. Additionally, cross-lingual transfer techniques overcome language barriers through embedding alignment and translation-based augmentation. Despite demonstrating exceptional performance on specific datasets, current methods still face challenges related to computational cost, dependence on pretrained corpora, and coverage of low-resource languages. Future research should explore lightweight transfer architectures, automated domain adaptation thresholds, and language-agnostic universal embedding representations to address the complex challenges posed by the global spread of health misinformation.
Hybrid Models
Detecting health misinformation is a complex task. Traditional single-model approaches often exhibit limitations in handling complex semantics, contextual dependencies, and large-scale textual data. To address these challenges, researchers have increasingly explored hybrid models that integrate ML and DL, as well as various DL methods, to leverage the strengths of different algorithms in order to improve robustness and accuracy. These models not only capture local textual features but also model long-range dependencies and deep semantic information, demonstrating significant potential in unimodal misinformation detection. This section systematically reviews hybrid-model-based unimodal detection methods, with a focus on innovations in architectural design, feature extraction, and task optimization. Additionally, it summarizes key technological advancements and existing limitations in current research.
Hybrid models combining ML and DL have shown great potential in detecting health misinformation. Bonet-Jover et al proposed an architecture for automatically detecting fake news in health domains. This model combines LSTM-Convolutional networks with LR, achieving a macro F1 score of 0.978 based on the 5W1H components. 150 Biradar et al developed several classifiers, including an early fusion-based DNN model, an ensemble of RNNs, a voting classifier architecture, and a multi-level bit-wise OR model. The study found that state-of-the-art language models and ensemble approaches outperformed other machine learning techniques in detecting fake news in COVID-19 datasets. 151 Serrano et al detected COVID-19 misinformation on YouTube by analyzing user comments. They used pre-trained Transformer models to identify conspiracy-related comments. The proportion of such comments, combined with TF-IDF features from video titles, was used to train ML classifiers. Results showed that using both comment content and conspiracy ratios achieved 89.4% accuracy, highlighting the value of user comments in misinformation detection. 152 Zhang et al proposed a health misinformation detection method combining a stance detection model (SDM) and a trust model (TM). The SDM, based on the T5 language model, generates support and opposition scores using predefined keywords and structured input. The TM, based on LR, predicts the correct health treatment answer using document stance scores and feature vectors. This method achieved 76% accuracy on TREC 2021 health-related questions. 153
Hybrid models based on multi-task learning and attention mechanisms further improve health misinformation detection performance. Kumari et al proposed a multi-task learning framework that combines a BERT-based model with an attention-enhanced BiLSTM encoder. The architecture jointly performs fake news detection (main task), along with emotion recognition and novelty detection (auxiliary tasks). On the COVID-Stance dataset, the model achieved state-of-the-art performance, with a 7.95% improvement in accuracy. 154 Ma et al proposed a dynamic word embedding method for Chinese text (DWtext), based on a dual-channel CNN model with parallel max-pooling and attention-pooling layers. The model was evaluated on 2 COVID-19 fake news datasets. Results showed that it performs well on noisy data and effectively balances local and global feature relevance. 155 Upadhyay et al introduced the Vec4Cred model, which combines BiLSTM-CNN and attention mechanisms for binary classification via fully connected layers. It performed strongly across multiple datasets and in assessing the authenticity of webpage health information. 156 Fernández-Pichel et al proposed a multi-stage retrieval system for detecting COVID-19 misinformation, based on the TREC 2020 Health Misinformation Track. The system integrates document relevance (BM25), passage relevance (MonoT5), and passage reliability (supervised and unsupervised). It compares unsupervised fusion with learning-to-rank. Results showed that passage relevance and unsupervised reliability estimation are effective, and that simple fusion methods outperform complex ones. 157
Hybrid approaches utilizing multi-aspect features and diverse feature extraction techniques have demonstrated strong performance in health misinformation detection. Al-Sarem et al proposed a hybrid deep learning model (LSTM-PCNN) for detecting COVID-19-related rumors on social media. The model combines LSTM networks with a Parallel CNN architecture. The study also examined the impact of static word embeddings such as Word2Vec, GloVe, and FastText on model performance. Experimental results showed that the proposed model outperformed other methods in terms of accuracy, precision, recall, and F1 score. 158 Wani et al developed a stance classification model for COVID-19-related tweets using BERT embeddings and a CNN-based classifier. The study compared 4 learning techniques—SVM, RF, DNN, and CNN. Among them, the BERT-CNN model achieved the best performance, with an average F1 score of 68.24%, highlighting the effectiveness of combining contextual embeddings with neural architectures. 159 Li introduced a multidimensional feature framework for health misinformation detection, grounded in Comprehensive Information Theory. The framework integrates syntactic and semantic features, including a novel method called LinguisticStrategy2Vec to extract deep semantic information. DL models such as DNN, LSTM, and CNN were employed for training and classification. Experiments on a real-world dataset showed that the proposed method achieved an F1 score of 0.831 and an AUC of 0.883. 65 Apostol et al proposed the FN-BERT-TFIDF model, which combines TF-IDF vectors, BERT embeddings, Bi-LSTM, and CNN for misinformation detection. The model achieved accuracy rates of 60.92% on the LIAR dataset and 87.92% on a COVID-19 dataset. 160
In conclusion, hybrid models have proven to be highly effective in detecting health misinformation by leveraging the strengths of multiple algorithms, overcoming the limitations of traditional single-model approaches. These models not only enhance the ability to capture complex patterns within the data but also improve robustness and accuracy through the integration of diverse features and processing methods. Despite significant progress, challenges such as data noise, model interpretability, and scalability remain to be explored further. Additionally, issues such as large-scale corpus storage for training, high computational complexity of models, and real-time processing capabilities still need to be addressed. Future research could focus on optimizing hybrid model architectures, developing more efficient feature extraction techniques, exploring interpretable deep learning models, and tackling bottlenecks in processing large-scale datasets. Furthermore, the integration of multimodal data and the development of real-time misinformation detection systems may become key directions for improving detection accuracy and broadening applicability. As hybrid model technologies continue to evolve, more breakthroughs in health misinformation detection are expected in terms of both accuracy and efficiency.
Multimodal Misinformation Detection
With the rise of social media, misinformation has evolved from a purely text-based modality to richer modalities, such as images, audio, and video. Therefore, identifying media-rich misinformation requires an approach that exploits and effectively combines the information acquired from these different modalities. 161 Data fusion is the process of combining information from multiple modalities to take advantage of all the different aspects of the data and extract as much information as possible to improve the performance of machine learning models, as opposed to using a single data aspect or modality. 162 Fusion mechanisms are commonly categorized into 3 types: early fusion, intermediate fusion, and late fusion. 163 Illustrations of the 3 fusion mechanisms are presented in Figure 5.

Illustrations of early, intermediate, and late fusion mechanisms.
Early Fusion
Early fusion refers to the process of integrating raw or preprocessed data from multiple modalities before feeding them into the model. 163 Hou et al explored automatic misinformation detection in online medical videos. They combined language, acoustic, and user engagement features, using an LSVM classifier. Their multimodal approach achieved an overall accuracy of 74%, outperforming single-modality methods. 164
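A minimal rendering of early fusion in the spirit of this setup is shown below (assuming scikit-learn): per-modality feature vectors are concatenated before a single linear SVM, with random placeholder arrays standing in for the language, acoustic, and engagement representations.

```python
# Early fusion sketch: concatenate modality features before one classifier.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n = 40
language = rng.normal(size=(n, 300))    # e.g., averaged word embeddings
acoustic = rng.normal(size=(n, 20))     # e.g., prosodic statistics
engagement = rng.normal(size=(n, 5))    # e.g., like/comment ratios

X = np.hstack([language, acoustic, engagement])  # fuse before the model
y = np.array([0, 1] * (n // 2))                  # toy labels

LinearSVC().fit(X, y)  # a single linear SVM sees the fused feature vector
```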
Intermediate Fusion
Intermediate fusion refers to combining features extracted from different modalities before passing them to the model for decision-making. 163 Intermediate fusion enhances detection accuracy and interpretability by integrating multimodal features at the intermediate layers of the model, leveraging attention mechanisms and cross-modal interactions to address semantic consistency challenges. Wang et al introduced semantic and task-level attention mechanisms to focus on key content in antivaccine messages. Their 3-branch model, with distinct attention mechanisms for each branch, combined 3 unimodal features into a 4-dimensional feature set and used an SVM model with an RBF kernel for classification. The model achieved an accuracy of 0.966, while the ensemble method reached 0.974. 165 González-Fernández et al utilized M-BERT to assess the reliability of websites based on textual information. They then employed Gaussian processes to integrate the output of the M-BERT classifier with visual design features to generate the final reliability estimation. By combining textual and visual features, their approach achieved an accuracy of 78% in classifying health-related websites. 166 Hua et al proposed a multimodal fake news detection method called TTEC. It uses back-translation for text data augmentation to enhance the model’s ability to learn topic-related features. BERT and ResNet-50 are employed to extract features from news text and entire images, respectively. A multi-head self-attention mechanism is applied to fuse multimodal information. Additionally, contrastive learning is introduced to address the limitations of weak inter-modality interaction and small data size. Experimental results on a COVID-19 news dataset demonstrate that TTEC outperforms existing methods. 167
Intermediate fusion techniques have also shown significant effectiveness in video-based health misinformation detection. Shang et al developed the TikTec multimodal learning framework for detecting misleading COVID-19 videos on TikTok. After extracting visual and speech features, a co-attention mechanism modeled the correlation between video frames and speech content. The fused multimodal information was encoded by a Gated Recurrent Unit (GRU), passed through an MLP layer, and classified with a softmax layer. TikTec outperformed the best baseline, achieving a 6.1% improvement in accuracy and a 4.8% increase in F1 score. 75 Shang et al proposed the DGExplain framework to address limitations in existing multimodal fake news detection methods during the COVID-19 pandemic. It integrates a range of mechanisms, including an Object-aware Multimodal Feature Encoder (OMFE), a Text-guided Visual Feature Generator (TVFG), and an Image-guided Text Feature Decoder (ITFD), to address cross-modal consistency issues. The Comment-driven Explanation Generator (CDEG) builds a content-comment graph to integrate multimodal information and user comments for fake news detection and explanation generation. The method outperforms existing approaches in accuracy, precision, recall, and F1 score. 168 Shang et al introduced the MultiTec framework for healthcare misinformation detection on TikTok. It combines subtitle-guided visual representation learning with acoustic-aware speech representation. The fusion of multimodal information is enhanced using a dual-attention aggregation network, outperforming baseline methods in accuracy, F1 score, and Kappa coefficient. 169
Late Fusion
Late fusion, also known as “decision fusion,” involves combining the individual decisions derived from each modality to generate the final prediction. This can be achieved through methods such as majority voting, weighted averaging, or employing a meta-machine learning model to aggregate the individual outputs. 163 Raj and Meel proposed the Allied Recurrent and Convolutional Neural Network (ARCNN) framework to distinguish fake and real COVID-19 news. It combines RNN and CNN for final predictions. The RNN component uses LSTM and Bi-LSTM for text classification, while the CNN component uses fine-tuned VGG-16, MobileNetV2, InceptionV3, and XceptionNet for image classification. Early fusion and 4 late fusion techniques are applied to combine the 2 modalities. Experiments on 6 COVID-19 fake news datasets show that weighted averaging is the best fusion method. Bi-LSTM outperforms LSTM, and XceptionNet and MobileNetV2 are the top choices for image classification. 170
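The decision-level logic is simple enough to sketch directly. In the NumPy example below, the probability outputs of a hypothetical text model and image model are fused by weighted averaging; the weights and probabilities are chosen purely for illustration.

```python
# Late fusion sketch: weighted averaging of per-modality predictions.
import numpy as np

p_text = np.array([0.92, 0.30, 0.65])    # P(fake) from the text model
p_image = np.array([0.70, 0.20, 0.40])   # P(fake) from the image model

weights = np.array([0.6, 0.4])           # e.g., tuned on validation data
p_final = weights[0] * p_text + weights[1] * p_image
print((p_final >= 0.5).astype(int))      # fused predictions: [1 0 1]
```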
In summary, multimodal learning approaches provide multidimensional solutions for health misinformation detection by integrating heterogeneous information from text, visual, audio, and user engagement data. Early fusion enhances global feature representation through front-end data integration, intermediate fusion improves semantic consistency modeling via attention mechanisms and cross-modal interactions, while late fusion leverages decision-level ensemble strategies to combine modality-specific advantages. Emerging trends reveal that contrastive learning-based feature alignment, bidirectional cross-modal interaction through attention mechanisms, and joint modeling of user comments with multimodal content have become pivotal techniques for performance enhancement. Although current studies demonstrate significant progress in public health crises like COVID-19, challenges persist regarding multimodal data scarcity, cross-modal semantic gaps, and adaptation to dynamic social media content. Future research should focus on self-supervised pretraining, interpretable fusion mechanisms, and real-time multimodal streaming processing to address the evolving ecosystem of health misinformation dissemination.
Other Advanced Detection Methods
Knowledge Graph-Based Methods
The modern version of the term “knowledge graph” originated with the release of the Google Knowledge Graph in 2012. It uses a graph-based data model to capture knowledge in application scenarios that involve integrating, managing, and extracting value from disparate data sources at scale. 171 Knowledge graphs are usually composed of 3 basic elements: entities, relationships and triples. Entities represent real-world objects, such as people, places, or concepts, while relationships define the connection between 2 entities. 64 The general framework for health misinformation detection based on knowledge graphs is illustrated in Figure 6.

General framework of knowledge graph-based health misinformation detection.
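To illustrate the triple structure underlying these systems, the toy sketch below stores (entity, relation, entity) triples in plain Python and performs the simplest possible claim check; real systems instead rely on graph embeddings, attention mechanisms, and multi-hop reasoning, and the example facts are purely illustrative.

```python
# Toy knowledge graph as a set of (subject, relation, object) triples.
medical_kg = {
    ("vitamin C", "treats", "scurvy"),
    ("insulin", "treats", "diabetes"),
    ("measles vaccine", "prevents", "measles"),
}

def supported_by_kg(subject, relation, obj):
    # Simplest possible check: is the claim's triple a known fact?
    return (subject, relation, obj) in medical_kg

print(supported_by_kg("vitamin C", "treats", "scurvy"))    # True
print(supported_by_kg("vitamin C", "treats", "COVID-19"))  # False -> suspect
```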
Knowledge graphs have been increasingly applied in health misinformation detection frameworks to enhance both accuracy and interpretability. Cui et al proposed the DETERRENT model to address misinformation detection in the medical domain. The model leverages a medical knowledge graph (KG) and an article-entity bipartite graph to detect medical misinformation and provide explanations. It employs an information propagation net, a knowledge-aware attention mechanism, and a prediction layer. In terms of F1 scores, DETERRENT outperforms all other methods by at least 4.78% on the diabetes dataset and at least 12.79% on the cancer dataset. 172 Shang et al proposed the MMAdapt framework for early health misinformation detection. It constructs a domain-specific knowledge graph for the source domain, extracts knowledge triples, and learns node representations from the source articles. An uncertainty-aware knowledge validation mechanism filters and validates high-uncertainty triples. The framework also incorporates a knowledge-post dual adaptation process, including domain-level and instance-level adaptation, to learn generic representations. This enables effective detection of health misinformation in the target domain. 173 Liu et al proposed the KG2S method. This method combines knowledge graphs with a 2-stage approach: Knowledge Breadth Retrieval (KBR) and Knowledge Depth Reasoning (KDR). In the KBR stage, a dual filter of similarity and diversity simulates human heuristic behavior using the knowledge graph’s rich facts. In the KDR stage, a hierarchical attention network mimics human coarse-to-fine knowledge analysis in decision-making. This model demonstrates excellent accuracy in detecting health misinformation. 64 Lara-Navarra et al proposed another innovative method for detecting misinformation in healthcare. They constructed a dynamic knowledge graph using topic-specific information collected from the internet. By analyzing factors such as tweet propagation paths, user interactions (eg, retweets), and user roles in the dissemination process, they identified clues to assess the truthfulness of information. 174
Knowledge graphs have also been widely adopted for identifying COVID-19-related misinformation, achieving notable improvements in accuracy and explanation quality. Weinzierl and Harabagiu proposed COVAXLIES, a COVID-19 tweet dataset, to construct a knowledge graph. They framed the detection as a graph link prediction problem. Experimental results show that this method outperforms neural classification, achieving up to a 10% improvement in F1 score. 175 Koloski et al proposed a novel document representation learning method based entirely on knowledge graphs, using an extensive set of subject-predicate-object triples. Tested on datasets like COVID-19 Fake News detection, the results show that when combined with existing context representations, the knowledge graph-based document representation achieves state-of-the-art performance. 176 Kou et al proposed the HC-COVID scheme, which models knowledge facts through collaboration between experts and non-experts to build crowdsourced hierarchical knowledge graphs. A binary hierarchical attention graph neural network is used to integrate this knowledge for detecting and interpreting COVID-19 misinformation. Evaluation results show that it significantly outperforms existing baselines in both detection accuracy and interpretability. 177
Knowledge graph-based health misinformation detection enhances both accuracy and interpretability through structured semantic associations and logical reasoning. Existing research primarily leverages the entity-relation modeling capabilities of knowledge graphs, constructing domain-specific knowledge bases from sources such as medical literature and social media data. These knowledge graphs are then integrated with techniques such as graph neural networks and attention mechanisms to enable deep reasoning. However, challenges remain, including delayed real-time updates, low efficiency in fusing multi-source heterogeneous data, and high computational complexity. Future research should explore dynamic knowledge evolution mechanisms, multimodal knowledge graph construction, and lightweight inference architectures. Additionally, incorporating causal analysis techniques could facilitate deeper exploration of misinformation propagation chains, contributing to a more efficient, transparent, and adaptable health information governance system.
Fact-Checking-Based Methods
Fact-checking assesses the authenticity of knowledge based on known facts. There are 2 approaches to fact-checking: manual or traditional fact-checking and automated fact-checking. The manual fact-checking process involves a group of domain experts verifying the content of articles or relying on expert-based fact-checking websites. Automated fact-checking techniques depend on information retrieval, natural language processing, and machine learning techniques to develop fact-checking models. 63 Fact-checking-based methods play a crucial role in detecting misinformation, particularly in the healthcare domain. The general framework for health misinformation detection based on manual and automated fact-checking methods is illustrated in Figures 7 and 8.

General framework of health misinformation detection based on manual fact-checking methods.

General framework of health misinformation detection based on automated fact-checking methods.
Manual fact-checking emphasizes human-driven verification, prioritizing nuanced interpretation and evidence transparency. Roitero et al recruited crowdworkers through the Amazon Mechanical Turk (MTurk) platform to assess the truthfulness of COVID-19 statements. A 6-point truthfulness scale was used, ranging from “pants-on-fire” (completely false) to “true” (completely true). To improve assessment quality, workers were required to provide evidence supporting their judgments, such as reference URLs and text explanations. The study found that crowdworkers could accurately assess the truthfulness of COVID-19 statements, especially when using mean aggregation, which aligned closely with expert judgments. However, workers struggled to distinguish between the most extreme categories, such as “pants-on-fire” and “false.” 178 Kaufman et al examined how group characteristics affect the accuracy of COVID-19 misinformation detection, focusing on differences between crowdsourced workers and university students. The study found that crowdsourced workers can effectively detect COVID-19 misinformation and sentiment. However, their performance is significantly influenced by cognitive style, attitude bias, and platform choice. 179
Automated fact-checking techniques based on similarity measures and feature engineering have achieved competitive performance in health misinformation detection. Barve and Saini proposed the CSM algorithm, which introduces novel features such as the “Content Similarity Score.” Their approach attained an accuracy of 91.06%, outperforming conventional similarity measures like Jaccard similarity. 63 Hossain et al formulated COVID-19 misinformation detection as a 2-step task: misconception retrieval and stance detection. For retrieval, they evaluated several semantic similarity and information retrieval approaches, finding that domain-adapted BERTSCORE—using COVID-Twitter-BERT—achieved the best performance, with Hits@1 improving from 38.7% to 61.3%. For stance detection, they reframed the task as natural language inference (NLI), mapping Agree, Disagree, and No Stance to Entailment, Contradiction, and Neutral, respectively. Combining domain-adapted retrieval with NLI classifiers yielded notable gains, with the best macro F1 score reaching 50.2%. 86
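As a hedged sketch of this stance-as-NLI reformulation, the snippet below pairs a post (premise) with a known misconception (hypothesis) using an off-the-shelf MNLI model through Hugging Face Transformers; roberta-large-mnli is a generic stand-in rather than the domain-adapted model used in the study, and the example texts are invented.

```python
# Stance detection recast as natural language inference (NLI).
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

misconception = "5G networks spread the coronavirus."
post = "There is no evidence that 5G has anything to do with COVID-19."

# Premise = post, hypothesis = misconception; entailment ~ Agree,
# contradiction ~ Disagree, neutral ~ No Stance.
out = nli({"text": post, "text_pair": misconception})
result = out[0] if isinstance(out, list) else out
stance = {"ENTAILMENT": "Agree", "CONTRADICTION": "Disagree",
          "NEUTRAL": "No Stance"}[result["label"].upper()]
print(stance, round(result["score"], 3))
```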
Automated fact-checking techniques leveraging pre-trained language models have demonstrated strong capabilities in detecting complex semantic misinformation, particularly in the healthcare domain. Nakov et al applied fact-checking techniques to detect COVID-19 misinformation. They used models like BERT and technologies such as WordNet to assess various attributes of tweets and news statements based on different tasks. Specific models were employed to verify and rank the claims, and evaluation metrics were used to optimize the detection results. 180 Sarrouti et al constructed the COVID-19 fact-checking dataset, HEALTHVER. They trained pre-trained models such as BERT, SciBERT, BioBERT, and T5 on claim-evidence pairs, minimizing cross-entropy loss and treating it as a multiclass classification problem. Experimental results show that models trained on the HEALTHVER dataset outperform those trained on synthetic or open-domain claims. The T5 model achieved the best performance, with an accuracy of 80.69%. 90
Automated fact-checking techniques integrating multimodal inputs and external evidence sources have enhanced the accuracy and robustness of misinformation detection systems. Martinez-Rico et al utilized MetaMap to extract unique medical concept identifiers (CUIs). A Transformer model was used for claim verification classification, integrating multiple inputs such as SPO (subject-predicate-object) triples and text. The Transformer-feedforward neural network (FFNN) ensemble model was trained for fact-checking classification, and the results were used to classify news items as true or false based on sentence-level verification. 5 Hammouchi and Ghogho proposed a multilingual automatic fact-checking method that determines the veracity of news by retrieving external evidence related to a claim and assessing the credibility of its sources. The method uses multilingual BERT for semantic representations and LSTM for context modeling. It integrates evidence with source credibility scores and feeds them into a classifier for truthfulness judgment. This framework performs excellently across multiple datasets and is well-suited for health-related misinformation in multilingual environments. 181
Automated fact-checking techniques incorporating open-source intelligence systems and human-AI collaboration frameworks have provided explainable and high-performing solutions for health misinformation detection. Martinez Monterrubio et al proposed a health misinformation detection method based on open-source intelligence (OSINT) and case-based reasoning (CBR). They developed the MedOSINT tool, using official COVID-19 health bulletins as a fact-checking database. The method involves collecting official data, extracting key information from social media news, matching relevant content in the OSINT database, and using CBR for case retrieval to provide explainable judgments. Experimental results show that MedOSINT achieves 93.33% accuracy in detecting COVID-19-related fake news, outperforming Google Search and other existing tools. 182 Kou et al proposed CEA-COVID, a human-AI collaborative framework for COVID-19 misinformation detection. It integrates crowdsourced workers, medical experts, and AI to assess information credibility while providing natural language explanations. The framework includes crowdsourced workers identifying logical inconsistencies, AI verifying medical facts using a knowledge graph, and experts updating the knowledge base. A Transformer model generates explanations. Experiments on the COVIDRumor and CONSTRAINT datasets show that CEA-COVID achieves F1 scores of 91.1% and 95.0%, outperforming BertCOVID, HC-COVID, and other existing methods. 183
In summary, fact-checking-based methods play a vital role in detecting health misinformation. Both manual and automated fact-checking approaches have their strengths and can complement each other. Manual fact-checking ensures nuanced interpretation and transparent evidence, while automated techniques improve detection speed and scalability. With advances in natural language processing, pre-trained language models, multi-task learning, and knowledge graphs, automated fact-checking has achieved significant progress, especially in identifying COVID-19-related misinformation. Additionally, human-AI collaborative frameworks and multilingual fact-checking methods have introduced new solutions for verifying health information across different languages and cultural contexts. In the future, integrating human expertise, knowledge-based resources, and intelligent algorithms to develop efficient, explainable, and scalable health misinformation detection systems will be a key direction for research.
Large Language Models
Large language models (LLMs), a subset of AI, are advanced computational models excelling in language generation and understanding. 184 Recent studies show that LLMs can effectively provide factual answers to clinical questions and accurately identify health-related misconceptions and misinformation.185,186 However, the rise of publicly accessible AI chatbots, capable of generating large volumes of persuasive human-like text, may accelerate misinformation spread, leading to an “AI-driven infodemic.” 187 The roles of large language models in health misinformation are illustrated in Figure 9.

Roles of large language models in health misinformation.
Zhou et al distilled human-created COVID-19 misinformation into narrative prompts and used GPT-3 to generate AI-generated counterparts, forming a comparative dataset. Analysis revealed that AI-generated misinformation differs linguistically, featuring enhanced detail and uncertainty. While existing detection models could identify AI misinformation, their performance declined. Moreover, current evaluation guidelines proved less effective, as AI-generated content often meets credibility and transparency criteria. 188 Garbarino and Bragazzi evaluated the effectiveness of Google Bard and ChatGPT-4 in identifying and understanding sleep health misinformation. Statistical and text analysis revealed that Bard identified misinformation with 95% accuracy, while ChatGPT-4 achieved 85%, with no significant difference. However, their rating distributions and alignment with expert assessments varied. Text analysis showed Bard emphasized practical advice, while ChatGPT-4 focused on theoretical aspects, and readability analysis indicated Bard’s responses were more accessible. 189 Choi and Ferrara proposed the FACT-GPT framework, which leverages LLMs to automate the claim matching task in fact-checking—identifying whether new content on social media supports or refutes previously fact-checked misinformation. The researchers constructed a matching dataset and tested several pre-trained LLMs (eg, GPT-4, Llama-2-70b), fine-tuning smaller LLMs for optimization. The results showed that LLMs perform near-human accuracy in claim matching, and fine-tuned smaller models achieve comparable performance to larger models at lower computational cost, demonstrating their potential for fact-checking applications. 190
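A claim-matching prompt in the spirit of FACT-GPT can be sketched without committing to a particular API; the helper below is a hypothetical illustration of how a previously fact-checked claim and a new post might be framed for an LLM, not the prompt used by the cited authors.

```python
# Hypothetical claim-matching prompt builder (no specific LLM API assumed).
def build_claim_matching_prompt(debunked_claim, new_post):
    return (
        "A fact-checker has already rated this claim as FALSE:\n"
        f"  Claim: {debunked_claim}\n\n"
        "New social media post:\n"
        f"  Post: {new_post}\n\n"
        "Does the post SUPPORT, REFUTE, or is it UNRELATED to the claim? "
        "Answer with one word."
    )

prompt = build_claim_matching_prompt(
    "Drinking hot water kills the coronavirus.",
    "My aunt says sipping boiled water all day protects you from COVID.",
)
print(prompt)  # send to any chat LLM and parse the one-word verdict
```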
Although several scholars have explored the use of artificial intelligence (AI) in generating and detecting health misinformation, many have overlooked the associated ethical concerns. These concerns include data ownership, informed consent for data use, bias and representativeness in training data, and privacy protection. 191 The absence of clear legal frameworks for data collection has raised serious ethical questions. In particular, there is growing concern about the use of personal data without consent or proper regulation. 187 Even without malicious prompts, large-scale generative models can produce inappropriate, biased, or false content. This is often due to inaccuracies or biases in the training datasets. 192 Such biases can further widen disparities in mental health services and reinforce existing social inequalities. To ensure their responsible use, future research must integrate technical innovation with ethical governance, ensuring that AI systems are both effective and socially accountable.
Discussion
Main Findings
This study provides a systematic review of health misinformation, covering its conceptual foundations, dissemination mechanisms, psychological impacts, susceptibility, available datasets, evaluation metrics, machine learning, and deep learning detection methods, as well as other advanced detection approaches. Based on this comprehensive analysis, the following key conclusions are drawn:
Feature Variability
Health misinformation is highly sensitive to the intentions of its disseminators and the context in which it spreads. It manifests in diverse forms and exhibits inherent relativity. Its propagation follows extended pathways, spreads rapidly, and demonstrates strong local clustering effects.
Dataset Diversity and Metric Reliability
Health misinformation datasets are highly heterogeneous in terms of data sources, with research priorities structured around specific domains. COVID-19 datasets dominate binary classification studies, while vaccine-related datasets continue to receive sustained attention. English remains the primary language, but multilingual datasets have emerged in recent years. There is an increasing demand for fine-grained classification of complex misinformation types. Conventional evaluation metrics may be inadequate for imbalanced health misinformation datasets; metrics such as MCC, Cohen’s Kappa, and Brier Loss offer more reliable performance assessment in such cases.
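For reference, these metrics are readily computed with scikit-learn, as in the sketch below; the labels and probabilities are toy values chosen to mimic class imbalance.

```python
# Imbalance-robust evaluation metrics on toy predictions.
from sklearn.metrics import (matthews_corrcoef, cohen_kappa_score,
                             brier_score_loss)

y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]         # heavily imbalanced labels
y_prob = [0.9, 0.2, 0.1, 0.4, 0.3, 0.2, 0.1, 0.2, 0.6, 0.8]
y_pred = [int(p >= 0.5) for p in y_prob]

print("MCC:  ", matthews_corrcoef(y_true, y_pred))
print("Kappa:", cohen_kappa_score(y_true, y_pred))
print("Brier:", brier_score_loss(y_true, y_prob))  # lower is better
```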
Algorithmic Synergy and Hybrid Optimization
Traditional ML algorithms remain competitive in health misinformation detection when robust feature engineering and computational efficiency are prioritized. Ensemble methods and evolutionary algorithms enhance performance by integrating the strengths of different models and optimizing feature selection. However, neural networks excel at capturing high-dimensional semantic patterns, despite their higher computational cost. This suggests that hybrid models may be the optimal strategy for balancing accuracy, interpretability, and resource constraints.
Deep Learning Complementarity and Multimodal Integration
DL methods exhibit complementary strengths in health misinformation detection, with traditional, attention-based, graph-based, transfer learning, and hybrid models each addressing distinct challenges. Multimodal approaches further enhance detection by integrating heterogeneous data, though issues like data scarcity, semantic alignment, and interpretability remain.
Other Advanced Method Efficacy and Emerging Risks
Other advanced detection methods—including knowledge graphs, fact-checking frameworks, and large language models—significantly enhance the accuracy, interpretability, and adaptability of health misinformation detection. However, challenges remain in computational efficiency, cross-cultural generalization, and mitigating AI-generated misinformation, highlighting the need for explainable, scalable, and human-AI collaborative systems.
Limitations
Despite significant progress in the field of health misinformation detection, current systems continue to exhibit several critical limitations that hinder their effectiveness and scalability. A detailed overview of these challenges is presented below.
Data-Related Limitations
Current research on health misinformation detection faces several data-related challenges. Many datasets remain small in scale,81,101,103,113,151 cover only short time spans,103,106 and often suffer from class imbalance.66,96,111 They are also restricted to specific diseases or single platforms, which limits model generalizability.72,93 In addition, most datasets lack social network-related metadata, such as user profiles, propagation pathways, and interaction histories; such information is crucial for tracing the origins and dynamics of misinformation. 108 Annotation introduces further difficulties: human labels are prone to subjectivity and inconsistency,65,93,96,152 require high labor costs, 84 and lack standardized terminology across corpora,33,120 which complicates model training and evaluation. Another major gap lies in modality and language coverage. Existing datasets focus mainly on text, while images, audio, and video receive little attention,65,76 and low-resource languages remain underexplored, constraining cross-lingual applicability.103,151 Finally, misinformation evolves rapidly with public health events, and models trained on outdated datasets often fail to detect emerging narratives, leading to performance degradation.17,126,139
Methodological Limitations
Methodological limitations also constrain progress in this field. Many approaches still rely on handcrafted or traditional statistical features, with limited use of semantic, emotional, temporal, or interaction-based representations.76,93,102,103,111 Feature integration and optimization remain insufficiently studied.4,62,110 Another key challenge lies in balancing model complexity and interpretability. Traditional machine learning methods often fail to capture complex patterns, 102 while deep learning models, though more effective, require heavy computation and lack transparency.113,133,136 Limited attention has also been paid to optimization strategies such as hyperparameter tuning, which may affect model stability and performance. 113 Furthermore, models are typically designed for specific events or single platforms, with poor cross-domain and cross-platform generalization.62,103 Finally, research places excessive emphasis on computational performance while giving insufficient attention to human-centered design and interdisciplinary collaboration. This weakens its relevance to clinical and public health contexts and may overlook the needs of vulnerable populations, thereby limiting the social value of research outcomes.
Evaluation and Application Limitations
Evaluation and application challenges further restrict real-world impact. The lack of unified evaluation frameworks makes it difficult to compare results across studies, as metrics, baselines, and experimental setups vary widely. 105 In addition, most research does not reflect the dynamic evolution and dissemination of misinformation in real environments, reducing robustness in complex social media ecosystems. 105 Early and real-time detection also remains underdeveloped.17,73,121,138,170 Finally, research often prioritizes algorithmic performance over practical concerns such as scalability, deployability, and user needs. This gap reduces the likelihood that detection systems will be adopted by public health institutions.
Future Research Directions
To address the current limitations and further advance health misinformation detection, several key research directions can be pursued. Together, they point toward more effective and scalable systems for combating misinformation in the digital age.
Enhancing Data Quality and Diversity
Future research should focus on building larger and more diverse datasets that span wider temporal ranges and encompass multiple diseases, platforms, and languages.115,151 Incorporating multimodal sources such as images, audio, and video can help capture the complexity of health misinformation more effectively.76,93,101,103,151 To better understand dissemination mechanisms, datasets should also include contextual metadata, such as author background, source credibility, and propagation patterns. 108 Improving annotation practices is equally important. Unified standards 120 and semi-automated labeling methods that combine human expertise with machine support can reduce subjectivity, 84 lower costs, and improve comparability across datasets. 33 Finally, continuous data collection and timely updates are needed to ensure that models remain responsive to evolving narratives during public health crises.115,121,138
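One possible form of such semi-automated labeling is confidence-based triage, sketched below: a model pre-labels the corpus and routes only low-confidence items to human annotators. The `triage_for_annotation` helper and its threshold are hypothetical illustrations, not a procedure taken from the cited works.

```python
# Minimal sketch of semi-automated labeling: a model pre-labels the corpus and
# only low-confidence items go to human annotators. The classifier interface
# (any sklearn-style model with predict_proba) and threshold are placeholders.
def triage_for_annotation(model, texts, confidence_threshold=0.9):
    """Split texts into auto-labeled and human-review queues."""
    probs = model.predict_proba(texts)  # shape: (n_texts, n_classes)
    confidence = probs.max(axis=1)      # top-class probability per text
    auto = [(t, int(p.argmax())) for t, p, c in zip(texts, probs, confidence)
            if c >= confidence_threshold]
    review = [t for t, c in zip(texts, confidence) if c < confidence_threshold]
    return auto, review  # humans verify `review`; `auto` is spot-checked
```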
Methodological and Model Innovation
Advancing methodological approaches is crucial for the progress of health misinformation detection. Future models should move beyond shallow representations by incorporating richer semantic, emotional, temporal, and interaction-based features,72,76 while also leveraging symbolic cues such as emojis and numerals that are common in social media.115,137 Studying the alignment between different feature types and classifier architectures 120 will help optimize feature–model combinations. Integrating medical knowledge and utilizing tools such as knowledge graphs and fact-checking systems will also reduce false positives and strengthen model reliability.4,61,101-103 In parallel, progress in multimodal fusion is needed, combining textual, visual, and auditory information152,154 while drawing on advances from computer vision for image and video analysis. 193 Transfer learning and domain adaptation techniques can improve robustness and enable generalization across platforms, events, and languages, thereby avoiding overfitting to narrow contexts.4,61,62,67,73,88,103,113,131,137 At the same time, models must balance interpretability and efficiency through visualization tools, lightweight embeddings, and resource-efficient compression methods that support real-world deployment.33,73,102,113,129,157 Systematic exploration of hyperparameter tuning and optimization strategies is also necessary to ensure stable performance.113,134 Finally, learning under limited supervision, including weakly supervised, semi-supervised, and self-supervised approaches, offers promising directions to address challenges such as annotation scarcity, class imbalance, and distribution shifts.88,96
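As one concrete instance of learning under limited supervision, the sketch below implements a single round of pseudo-labeling with scikit-learn; the `pseudo_label_round` helper, the TF-IDF/logistic-regression pipeline, and the 0.95 confidence threshold are illustrative assumptions rather than settings from the cited studies.

```python
# Minimal sketch of pseudo-labeling: a model trained on scarce labels labels
# unlabeled posts, and only high-confidence predictions join the training set.
# All data variables are placeholders.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def pseudo_label_round(labeled_texts, labels, unlabeled_texts, threshold=0.95):
    """One round of pseudo-labeling; returns the augmented training set."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(labeled_texts, labels)

    probs = model.predict_proba(unlabeled_texts)
    confident = probs.max(axis=1) >= threshold  # keep only confident predictions
    new_texts = [t for t, keep in zip(unlabeled_texts, confident) if keep]
    new_labels = probs.argmax(axis=1)[confident]

    return labeled_texts + new_texts, np.concatenate([labels, new_labels])
```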
Improving Evaluation and Real-World Applications
Improving evaluation and application practices is essential for bridging the gap between research and deployment. Establishing standardized evaluation frameworks and benchmark datasets will allow more consistent and transparent comparison of detection methods. 105 Future systems should also prioritize early detection by leveraging user interaction signals, social network structures, and multi-source metadata,61,73,96 while exploring zero-shot learning to identify emerging misinformation in a timely manner. 175 To maximize societal impact, detection technologies must be integrated with policy and education. For example, aligning systems with public health policies and crisis management strategies can support effective interventions, 72 while educational approaches such as new media teaching can strengthen public health literacy and reduce the spread of misinformation. 194 In addition, adoption and sustainability require attention to social and institutional factors. Theoretical perspectives such as the Technology Acceptance Model and the Resource-Based View can provide insights into user acceptance and long-term viability. 195 Ultimately, interdisciplinary and human-centered approaches will be critical. Collaboration among AI researchers, medical experts, public health professionals, and policymakers can ensure ethical deployment,25,196 while integration with social media data can help mental health professionals monitor psychosocial impacts, track public discourse, and design targeted interventions for vulnerable groups. 197
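For instance, zero-shot classification can flag an emerging narrative without task-specific training data, as in the minimal sketch below using the Hugging Face `transformers` pipeline; the model choice and label phrasing are assumptions for illustration.

```python
# Minimal sketch of zero-shot detection of an emerging narrative with a
# Hugging Face NLI model; the model and candidate labels are assumptions.
from transformers import pipeline

zero_shot = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

post = "This new supplement reverses diabetes in two weeks, doctors hate it!"
result = zero_shot(post, candidate_labels=["health misinformation",
                                           "reliable health information"])
print(result["labels"][0], result["scores"][0])  # top label and its score
```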
Conclusion
In this study, a comprehensive and systematic literature review on health misinformation detection was conducted to provide a thorough understanding of the concept of health misinformation, its dissemination mechanism, psychological impact, susceptibility, existing datasets, and evaluation metrics. By systematically retrieving and analyzing 100 high-quality publications from 2016 to 2025, a wide range of detection approaches were reviewed, including traditional machine learning algorithms, deep learning models, knowledge graph-based frameworks, fact-checking strategies, and the latest large language models. Their respective strengths and limitations in practical applications were compared, and future research directions were proposed. This review not only provides a clear overview of the current research landscape but also emphasizes the importance of interdisciplinary collaboration, human-centered design, and ethical considerations in developing effective and clinically relevant detection systems.
In future work, we aim to expand our literature retrieval by including multiple academic databases beyond Google Scholar to ensure more comprehensive coverage of relevant studies. We also plan to refine our keyword strategies to capture a broader range of research topics and methodologies. To minimize bias and enhance the rigor of the review process, we will implement standardized inclusion and exclusion criteria and involve multiple reviewers in the screening process. Additionally, we seek to incorporate multilingual literature to broaden the diversity of perspectives, and we will adopt interdisciplinary approaches that integrate qualitative and quantitative methods to provide a more comprehensive and in-depth understanding of health misinformation detection.
Appendix
Deep learning algorithms.
| Full name | Abbreviation | Basic explanation | Articles |
|---|---|---|---|
| Multi-Layer Perceptron | MLP | An MLP consists of input, hidden, and output layers, is trained by the backpropagation algorithm, and can model complex nonlinear relationships. | Sicilia et al, 85 Bonet-Jover et al, 84 Madani et al, 109 Bangyal et al, 115 Nabożny et al, 23 Dhankar et al 149 |
| Artificial Neural Network | ANN | An ANN is a mathematical or computational model that mimics the structure and function of biological neural networks to estimate or approximate functions. | Elhadad et al, 121 Shams et al 112 |
| Convolutional Neural Network | CNN | CNNs extract features through convolutional, pooling, and fully connected layers for classification or regression, and are particularly well suited to data with a grid-like topology, such as images and time series. | Du et al, 66 Li, 65 Al-Yahya et al, 118 Bangyal et al, 115 Tomaszewski et al, 116 Dharawat et al, 96 Dai et al, 61 Alhakami et al, 110 Kaliyar et al, 124 Luo et al, 88 Kumar et al, 125 Narra et al, 111 Albalawi et al 117 |
| Recurrent Neural Network | RNN | RNNs process and predict sequence data by capturing temporal dependencies through recurrent connections and hidden states. | Du et al, 66 Iwendi et al, 114 Al-Yahya et al, 118 Bangyal et al 115 |
| Long Short-Term Memory | LSTM | By introducing memory cells and gating mechanisms such as forget gates, LSTM mitigates the vanishing and exploding gradient problems that traditional RNNs face on long sequences. | Sharma and Garg, 108 Raj and Meel, 170 Li, 65 Hayawi et al, 67 Chen et al, 17 Iwendi et al, 114 Bangyal et al, 115 Dai et al, 61 Alhakami et al, 110 Abdelminaam et al, 119 Chen and Lai, 126 Narra et al 111 |
| Bi-directional LSTM | Bi-LSTM | Bi-LSTM extends LSTM by processing sequences in both forward and backward directions, improving the model’s ability to understand and predict sequence data. | Mahara et al, 107 Sharma and Garg, 108 Raj and Meel, 170 Chen et al, 17 Bangyal et al, 115 Tomaszewski et al, 116 Dharawat et al, 96 Kumari et al, 154 Luo et al, 88 Chen and Lai, 126 Albalawi et al 117 |
| Gated Recurrent Unit | GRU | GRU is an RNN variant that, like LSTM, addresses vanishing and exploding gradients on long sequences, but with a simpler gating structure. | Chen et al, 17 Iwendi et al, 114 Al-Yahya et al, 118 Bangyal et al, 115 Abdelminaam et al, 119 Chen and Lai 126 |
| Visual Geometry Group-16 | VGG-16 | VGG-16 has 16 weight layers and performs well in tasks such as image classification and object detection. | Sharma and Garg, 108 Raj and Meel 170 |
| Residual Network | ResNet | ResNet is a deep CNN architecture that uses residual (skip) connections to address the vanishing gradient problem in very deep networks. | Narra et al, 111 Sharma and Garg 108 |
| Bidirectional Encoder Representations from Transformers | BERT | BERT is a deep bidirectional Transformer-based pre-trained model that learns rich language representations from large-scale corpora. | Hayawi et al, 67 Qasim et al, 138 Al-Yahya et al, 118 Dharawat et al, 96 Morita et al, 131 Kumari et al, 154 Das et al, 134 Malla and Alphonse, 133 Abd Elaziz et al, 137 Hossain et al, 86 Ayoub et al, 139 Alghamdi et al, 113 Hua et al 167 |
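For readers unfamiliar with these architectures, the sketch below shows a minimal PyTorch Bi-LSTM text classifier corresponding to the table’s Bi-LSTM entry; the vocabulary size, embedding dimension, and hidden size are arbitrary illustrative values, not settings from the cited studies.

```python
# Illustrative PyTorch Bi-LSTM text classifier matching the Bi-LSTM description
# above (one forward and one backward pass over the sequence); sizes are
# arbitrary examples.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size=30_000, embed_dim=128, hidden=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                            bidirectional=True)    # forward + backward directions
        self.out = nn.Linear(2 * hidden, num_classes)  # both directions concatenated

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        embedded = self.embed(token_ids)             # (batch, seq, embed_dim)
        _, (h_n, _) = self.lstm(embedded)            # h_n: (2, batch, hidden)
        final = torch.cat([h_n[0], h_n[1]], dim=-1)  # fuse the two directions
        return self.out(final)

model = BiLSTMClassifier()
logits = model(torch.randint(0, 30_000, (4, 50)))  # batch of 4 posts, 50 tokens
```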
Acknowledgements
Not applicable.
Author’s Note
Didier El Baz is now affiliated with Harbin Engineering University, Harbin, 150006, China.
Ethical Considerations
This work is a literature review and did not involve the collection of primary data from human participants, animals, or identifiable individuals. Therefore, ethics approval was not required. 198
Consent to Participate
This work does not contain any studies with human participants performed by any of the authors.
Author Contributions
Xiaoye Feng collected all the data and wrote the manuscript. Jia Luo proposed the original idea and addressed all the issues in the manuscript. Yang Yang and Didier El Baz revised and polished the final version. Lei Shi created the illustrations.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Beijing Natural Science Foundation (Grant No. 9242003), the Natural Science Foundation of Chongqing, China (Grant No. CSTB2023NSCQ-MSX0391), the National Natural Science Foundation of China (Grant No. 72104016), and the Key Laboratory of Public Opinion Governance and Computational Communication (Grant No. YQKFYB202501).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Data will be made available on request.
Guarantor
Not applicable.
