Abstract
While public health organizations can detect disease spread, few can monitor and respond to misinformation in real time. Misinformation risks the public’s health, the credibility of institutions, and the safety of experts and front-line workers. Big Data, and specifically publicly available media data, can play a significant role in understanding and responding to misinformation. The Public Good Projects uses supervised machine learning to aggregate and code millions of conversations relating to vaccines and the COVID-19 pandemic broadly, in real time. Public health researchers supervise this process daily, and provide insights to practitioners across a range of disciplines. Through this work, we have gleaned three lessons to address misinformation. (1) Sources of vaccine misinformation are known; there is a need to operationalize learnings and engage the pro-vaccination majority in debunking vaccine-related misinformation. (2) Existing systems can identify and track threats against health experts and institutions, which have been subject to unprecedented harassment. This supports their safety and helps prevent the further erosion of trust in public institutions. (3) Responses to misinformation should draw from cross-sector crisis management best practices and address coordination gaps. Real-time monitoring and addressing misinformation should be a core function of public health, and public health should be a core use case for data scientists developing monitoring tools. The tools to accomplish these tasks are available; it remains up to us to prioritize them.
This article is part of a special theme on Studying the COVID-19 Infodemic at Scale. To see a full list of all articles in this special theme, please click here: https://journals.sagepub.com/page/bds/collections/studyinginfodemicatscale
Introduction
Information spreads as fast as disease. Since March 2020, the world has been inundated with an avalanche of misinformation about nearly every aspect of the COVID-19 pandemic, including vaccine safety, public health guidelines, and the legitimacy of health authorities and science (Andersen et al., 2020; Cook et al., 2020; Roozenbeek et al., 2020). While public health organizations are skilled at tracking and predicting disease spread, most are incapable of tracking and responding to disease-related misinformation as it happens. There now appears to be little doubt this has contributed to the negative health impacts of what the World Health Organization terms an infodemic: a deadly, confusing swirl of too much information, and too much misinformation in particular (Swire-Thompson and Lazer, 2020).
Since 2017, The Public Good Projects (PGP) has used media monitoring to identify and track various health topics in large datasets comprised of public media data. Topics have included mental health, opioids, tobacco products, domestic violence, immunization, and most recently COVID-19. In June 2019, we created Project Vaccine Communication Tracking and Response (VCTR, at projectvctr.com), a full-time effort to quantify and categorize all communications related to vaccines across a broad range of media sources (Project VCTR, 2021). In March 2020, we created Project Rapid Collection Analysis Interpretation Dissemination (RCAID, at rcaid.org), to supplement Project VCTR’s data with media data related to COVID-19 generally (Project RCAID, 2021). Both projects focus primarily on identifying misinformation to which the public may become exposed. Both rely on supervised machine learning to collect, aggregate, analyze, and code millions of messages about vaccines and COVID-19 into thematic categories tracked over time (Bonnevie et al., 2020a). Insights from these analyses include communications recommendations to various stakeholders such as health departments and health systems, non-governmental organizations, researchers, and journalists. As of February 2021, PGP has collected, analyzed, and coded over 140 million mentions of vaccines within Project VCTR and over 875 million mentions of COVID-19 within Project RCAID.
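The supervised approach described above—analysts labeling messages into thematic categories, a model coding new messages at scale, and humans reviewing the output daily—can be sketched in miniature. The categories, training examples, and library choices below are illustrative assumptions, not PGP’s actual (much larger, proprietary) system:

```python
# Minimal sketch of supervised thematic coding of media messages,
# assuming scikit-learn. Labels and examples are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hand-labeled examples (analyst-coded), by thematic category.
train_texts = [
    "vaccines cause more harm than the disease itself",
    "the covid vaccine was rushed and is not safe",
    "got my flu shot today, protect your community",
    "clinical trials show the vaccine is safe and effective",
]
train_labels = ["opposition", "opposition", "support", "support"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_texts, train_labels)

# New, unlabeled messages are coded automatically; in a supervised
# workflow, analysts review samples of this output daily to catch
# drift and new narratives, then retrain.
new_messages = ["the shot is a rushed experiment, not safe"]
predictions = model.predict(new_messages)
print(predictions)
```

In practice the value lies less in the classifier itself than in the human-in-the-loop review cycle: category counts tracked over time become the trend data reported below.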
This article presents learnings PGP has gleaned from this monitoring, focused on three issues in particular: vaccine opposition, attacks against health authorities and science, and rapid responses to misinformation. Methods of Big Data analysis can assist in more effectively addressing these issues.
Vaccine opposition—It’s not ‘what we know’; it’s ‘how we use it’
Vaccine opposition and hesitancy are threats to global health, with experts citing vaccine hesitancy as one of the biggest challenges facing the world (UNICEF, 2019; WHO, 2019). Vaccine opposition has increased with the COVID-19 pandemic, as lies, conspiracy theories, and misunderstandings about vaccines run rampant (Bonnevie et al., 2020a; Roose, 2020; Tardáguila, 2020; Zhang, 2020). Vaccine misinformation threatens to erode decades of progress in preventing disease spread and limits the effectiveness of mass immunization campaigns for COVID-19. We have seen vaccine opposition increase by 99.8% across data sources that we monitor. On Twitter specifically, there has been a 212.8% increase in accounts promoting vaccine opposition between March and December 2020 compared to the nine months prior (Project VCTR, 2020). While vaccine opposition has increased during the pandemic, routine vaccinations have decreased globally (UNICEF, 2020; WHO, 2020a, 2020b). The void created by health authorities pivoting their resources to respond to the pandemic has been filled with misinformation.
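The period-over-period growth figures cited above are straightforward to compute once mentions are counted within comparable time windows. A small sketch, using illustrative (not actual) mention counts:

```python
def percent_change(baseline: float, current: float) -> float:
    """Percent change from a baseline period to a comparison period."""
    return (current - baseline) / baseline * 100

# Hypothetical mention counts in two windows; figures like the
# +212.8% Twitter increase cited above are computed the same way.
baseline_mentions = 10_000   # nine months before March 2020 (illustrative)
pandemic_mentions = 31_280   # March-December 2020 (illustrative)
print(round(percent_change(baseline_mentions, pandemic_mentions), 1))
```

The subtlety in practice is not the arithmetic but keeping the collection pipeline, source list, and query terms constant across both windows so the comparison is valid.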
Vaccine opposition has infected digital and social media (Getman et al., 2017; Witteman and Zikmund-Fisher, 2012). Various studies have identified conversation themes and strategies of the anti-vaccination movement (Betsch and Sachse, 2012; Blankenship, 2018; Bonnevie et al., 2020b; Broniatowski et al., 2018; Deiner et al., 2019; Hoffman et al., 2019; Kata, 2012; Lutkenhaus et al., 2019; Meyer et al., 2019; Mollema et al., 2015; Shoup, 2019). We have seen vaccine opponents become emboldened, capitalizing on understandable hesitancy regarding novel COVID-19 vaccines to sow distrust in all immunizations, and against health authorities, government, and pharmaceutical companies (Bonnevie et al., 2020a). Yet our analyses conclusively demonstrate that the sources of vaccine misinformation are readily identifiable. Dominant talking points originate from a handful of highly influential, well-connected accounts (Bonnevie et al., 2020b). The narratives found within vaccine opposition have remained fairly consistent when compared to time periods prior to the pandemic, with one notable exception being the targeting of health authorities (Smyser, 2020).
Public health has the ability to determine what vaccine opponents are saying, where their words are spreading, and who is predominantly responsible for that spread. However, the field has struggled with taking action based on that knowledge. Social media platforms have significantly strengthened their ability to flag misinformation, though these systems are still far from perfect. The world is better prepared to address vaccine misinformation than ever before. Now public health must be held just as accountable as the social media companies for taking action.
The pro-science, pro-vaccination majority of the general public has historically been comparatively silent on the importance of immunizations (particularly in comparison to their vaccine opposing counterparts), and largely disengaged from the work of debunking vaccine-related misinformation. Perhaps the problem of misinformation has been so well publicized in the scientific literature and popular press that people incorrectly believe a unified global effort is underway, closely coordinating with national and local partners. This is not the case. Were this to become reality, we believe a pro-science majority comprised of cross-sector partners could quickly become more organized and effective than the current global anti-vaccination movement.
Collective responsibility and data monitoring should extend to supporting health officials
Health experts and officials at all levels have been targeted for their support of basic health principles. Just as vaccine misinformation can be tracked, so can attacks on health authorities and practitioners. Our data shows that the COVID-19 era has corresponded with an unprecedented onslaught of attacks on health experts. Our systems identified threats made against the United States’ top disease expert Dr Anthony Fauci in March 2020 (Alba and Frenkel, 2020). Since then, online harassment against him and other experts has only increased; from March 2020 to February 2021, the two hashtags #FireFauci and #FauciFraud have been used on Twitter over 470,000 times, or an average of 1359 times per day. Harassment, some serious and including threats of physical violence, occurs daily and is directed toward those who simply espouse basic public health tenets. State and local health workers have had their private information published online, seen armed protestors appear at their houses, and faced harassing phone calls, social media posts, and physical threats (Hart, 2020; Mello et al., 2020; National Governors Association, 2020; Smith and Weber, 2020; Van Beusekom, 2020). Since the start of the pandemic, dozens of health officials in the United States (US) have resigned, with many citing harassment as a reason for their departure. Contact tracers have also experienced threats, often driven by misinformation that the government or pharmaceutical industry are using the pandemic to engage in large-scale surveillance or population control (Dellinger, 2020; Stone, 2020).
Verbal assault is a risk factor for future physical violence (National Governors Association, 2020) and carries the risk of impacting long-term perceptions of public health. Since the beginning of the pandemic, we have observed increases in skepticism toward the public health community. Since March 2020, vaccine opponents on Twitter have shown a 217.8% increase in mentions of national health authorities, compared to the previous nine months. Vaccine opponents have paired their anti-vaccine messages with criticism of individual health experts, which seems likely to have contributed to the growing attacks against them. The societal impact of this concerted effort to sow distrust in health authorities and institutions will likely have long-term consequences.
The public health crisis response requires collaboration
Public health now exists in the world of QAnon and populist leaders strategically undermining trust in public institutions. While the weaponization of fake news is nothing new to public health (Bernard, 2021), as we enter the third decade of the 21st Century many of the assumptions underpinning traditional public health communications are no longer valid. Facts no longer speak for themselves. Communication from leaders is not one-dimensional. The public does not obtain its information from the same place, and misinformation fills any communication vacuum, at any geographic level.
Public health departments are therefore understandably ill-equipped to unilaterally counter misinformation and threats to their staff, especially while fighting the largest pandemic in a century. Three contributing factors are ripe for cross-sector collaboration: (1) Communication is secondary to the technical research and fieldwork unique to public health; (2) The framework for public health crisis response communications relies on a 20th-century model; and (3) Public health departments are under-resourced, a problem the pandemic has worsened, making it all but impossible to allocate extra staff and grow technical capability. The field of infodemiology is emerging to improve how public health prevents and mitigates misinformation (Kirk Sell et al., 2021). In June 2020, the World Health Organization convened a summit to begin mapping out the field of infodemiology, “the science of managing infodemics,” comparing pathogens in epidemics to misinformation in health emergency response (WHO, 2020c). Cross-sector collaboration at the individual, organizational, and policy levels will be crucial to the success of this new field.
Big Data already plays an essential role in public health: researchers conduct systematic evaluation of morbidity (illness) and mortality (deaths) to identify causes, modes of transmission, and appropriate control and prevention measures for a wide range of health topics and applications. However, the massive amount of media data leveraged by businesses, publishers, national security organizations, and other sectors continues to be foreign to most epidemiologists.
The World Health Organization’s proposed actions to tackle the COVID-19 infodemic could not come at a more important time (Kirk Sell et al., 2021; WHO, 2020c). The foundational public health crisis communications training in the US, Crisis and Emergency Risk Communication (CERC), centers around communicating key information to the public during an event (Centers for Disease Control and Prevention (CDC), 2018). Current resources draw mainly from examples such as environmental disasters and infectious disease outbreaks such as Zika (CDC, 2018). Training focuses on how to select and prepare the most appropriate public speaker to share consistent, trustworthy, and evidence-based information. CERC guidance mostly assumes one-way communication, harkening back to the days when the public received its information from television, print, and radio. It does not mention systematically tracking misinformation or protecting the reputation and safety of speakers, or the need to collaborate with influential stakeholders. The CERC Pandemic Influenza manual in particular has not been revised since 2007 (CDC, 2007).
While health communications resources exist across US federal agencies, mentions of misinformation have only begun to emerge, if referenced at all. The US National Institutes of Health (NIH) published an article in February 2021 acknowledging the “rapid-fire” spread of misinformation, and published peer-reviewed studies surface only when one deliberately searches for misinformation-related keywords (NIH, 2021). As of February 2021, neither the US CDC nor the US Department of Health and Human Services (HHS) appear to provide public resources specific to misinformation that has emerged during the COVID-19 pandemic. The US Cybersecurity and Infrastructure Security Agency (CISA) and the US Federal Emergency Management Agency (FEMA) have published elementary resources addressing COVID-19 “disinformation” and “rumors”, respectively. FEMA’s guidance is specific to rumors about the agency, rather than misinformation threatening emergency response effectiveness at large (FEMA, 2021). CISA provides a COVID-19 Disinformation Toolkit to help officials bring awareness to COVID-19 misinformation origins, scale, response, prevention, and treatment (CISA, 2021). Absent from these sites is guidance on how to address myths, falsehoods, or threats to public officials.
Crisis communications frameworks and tools from other disciplines would add tremendous value to how public health modernizes its approach to tackling misinformation. While sector and organizational nuances exist, crisis communication workflows generally fall into three main phases: pre-crisis planning, crisis response, and post-crisis learning. Media monitoring and Big Data analysis can be an integral resource within each phase. The media intelligence firm Zignal Labs, PGP’s partner on Project RCAID, recommends managing media crises by analyzing “Three V’s:” Volume (e.g., the amount of media data), Velocity (e.g., the rate at which media is spread), and Variety (e.g., the range of sources, channels and topics of media data; Zignal Labs, 2020). Benchmarks and thresholds can parallel key epidemiological indicators like the basic reproduction number “R0”, defined as the average number of secondary cases generated by a single infected individual. These types of indicators can help teams set priorities when anticipating a crisis, automate alerts to know when a situation should be treated as a crisis, and evaluate the effectiveness of their response.
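The “Three V’s” and an automated alert threshold can be sketched over a stream of timestamped mentions. The data, source labels, and 1.5x growth threshold below are illustrative assumptions, not figures from any monitoring platform:

```python
from collections import Counter

# Hypothetical timestamped mentions of one narrative: (source_type, day).
mentions = [
    ("twitter", "2020-03-01"), ("twitter", "2020-03-01"),
    ("news",    "2020-03-01"), ("twitter", "2020-03-02"),
    ("blogs",   "2020-03-02"), ("twitter", "2020-03-02"),
    ("twitter", "2020-03-02"), ("news",    "2020-03-02"),
]

volume = len(mentions)                                 # Volume: total mentions
per_day = Counter(day for _, day in mentions)          # mentions per day
velocity = per_day["2020-03-02"] / per_day["2020-03-01"]  # Velocity: day-over-day growth
variety = len({source for source, _ in mentions})      # Variety: distinct source types

# Illustrative threshold, loosely analogous to R0 > 1 in epidemiology:
# a narrative growing faster than this rate is escalated for review.
ALERT_GROWTH = 1.5
if velocity > ALERT_GROWTH:
    print(f"alert: narrative growing {velocity:.2f}x day-over-day "
          f"across {variety} source types")
```

In a real monitoring system the same computation would run continuously over rolling windows, with thresholds calibrated per topic from historical baselines rather than fixed constants.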
Public health in the US can also benefit from approaches developed in other countries and other fields, such as the RESIST Countering Disinformation Toolkit from the UK Government Communication Service (2021). This model points to the role of media monitoring within specific steps of the process, while also emphasizing the importance of data-based decision-making. Additionally, in partnership with UNICEF, First Draft, the Yale Institute for Global Health, and PGP, the Vaccine Misinformation Management Field Guide provides guidance on developing strategic national action plans informed by social listening, to counter misinformation about vaccines globally (United Nations Children’s Fund, 2020). Journalists around the world are also trained on how to manage personal attacks due to their reporting. The Poynter Institute, home to the Craig Newmark Center for Ethics and Leadership, provides guidance for journalists who are victims of personal attacks (Poynter, 2013). Social media marketing specialists consider crisis management a component of their work, with processes, monitoring tools, and content publishing platforms to make workflows as efficient and effective as possible. Platform developers such as Hootsuite provide toolkits and technical support for managing social media crises (Hootsuite, 2019). While the cost of some media monitoring platforms may be outside the scope of organizations with limited financial resources, such as health departments, free tools are available (First Draft, 2021).
Aside from staff capacity and funding (Kaiser News Network, 2020), a key barrier to adoption is that these tools are not designed for public health use cases. The most sophisticated, user-friendly media monitoring platforms are made for businesses to segment customers and guide them along a sales funnel. Academic platforms can be among the most robust and reliable but are not designed for rapid communication response workflows. COVID-19 dashboards, like Critical Trends from Johns Hopkins University & Medicine (2021), state and county-level tracking from the CDC’s COVID Data Tracker (CDC, 2021) and the COVID Tracking Project (COVID Tracking Project, 2021), and reported patient impact and hospital capacity data from HHS (Department of Health and Human Services, 2021), among others, have become essential to tracking the pandemic in the US. The nascent field of infodemiology would benefit from the combined expertise of the customer-centric private sector and rigorous subject-matter expertise of the public sector. This also presents career opportunities for specialists across multiple fields to help tackle the COVID-19 infodemic and drive the field of infodemiology forward.
Conclusion
We have witnessed the susceptibility of the world to misinformation. Now we must act with purpose to regain lost ground. Universities with public health or health communications programs should incorporate infodemiology as a core aspect of coursework; likewise, data science degrees should incorporate education about public health. Current public health workers should be trained to analyze health-related media data in a way that becomes as routine as their monitoring of disease-related data. Health departments and public health programs should monitor health conversations, and threats against their health workers, with the same regularity and normalcy as they monitor disease, and react to those threats as swiftly as possible. The strategies used to accomplish this are not new: the corporate, cyber security, and journalism fields employ them as standard practice. The tools and training needed to accomplish these recommendations are available. Now it is a decision of whether, and when, to use them.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
