Exploring the semantics in mental health research

Abstract

Mental health is an increasingly important public health issue. Based on data from the National Survey on Drug Use and Health (NSDUH), 1 in 5 adults experiences a mental illness, and nearly 1 in 25 adults lives with a serious mental illness in the United States.¹ In particular, the annual suicide rate in the United States has continued to climb over the past several decades, and suicide is now the 10th leading cause of death in the United States.

In parallel, there has been a rapid growth in the implementation of electronic health records (EHRs), leading to an unprecedented expansion in the availability of dense longitudinal data sets for clinical and translational research on psychiatric disorders. Meanwhile, the rapidly growing archive of consumer data from social media platforms such as Twitter and Facebook also provides unprecedented opportunities to reach a broad population with mental health issues. As a result, recent years have witnessed a rapid growth of “big data” studies aiming to extract and study risk factors, phenotyping information, and human behaviors from EHRs and social media data. Nevertheless, these extracted data are rarely standardized and have poor semantic interoperability. These heterogeneous data sets need to be formally represented using an ontological and semantic framework for downstream analyses, applications, and reasoning. However, psychiatric information often shows unique characteristics, such as subjective descriptions of patient experience and idiosyncratic psychosocial backgrounds, leading to challenges of data sparseness and diversity. Novel natural language processing (NLP), ontology, and semantic web technologies are needed to address these challenges.

This Special Issue, “Semantics of Mental Health,” is a collection of articles that extend the authors’ presentations at the first international workshop on the semantics of mental health (SemanticsMH 2018), hosted in conjunction with the sixth IEEE international conference on health care informatics (ICHI 2018) in New York City. SemanticsMH 2018 was a great success. In addition to authors presenting their work, two keynote speakers, Drs Jessie Tenenbaum and Piper Ranallo, from the Mental Health Informatics Working Group (WG) of the American Medical Informatics Association (AMIA), presented a WG report titled “Mental Health Informatics and Knowledge Representation.” The six articles in this Special Issue showcase state-of-the-art research and development efforts across a wide range of fields, including NLP, knowledge representation, knowledge management, and data science, demonstrating innovative methods, applications, and tools to address problems in mental health.

Duan et al compared pharmacovigilance outcomes reported in the Food and Drug Administration’s (FDA) Adverse Event Reporting System (FAERS) with those reported in real-world data from EHRs for 12 antidepressant drugs. Notably, they harmonized the different terminologies used in EHRs (i.e. International Classification of Diseases (ICD) codes) and FAERS (i.e. Medical Dictionary for Regulatory Activities (MedDRA)) to make the comparison between the two data sources possible. They showed that both FAERS and EHR data have strengths and limitations for postmarketing pharmacovigilance and that the results from the two data sources showed low consistency; thus, more sophisticated informatics and statistical modeling tools need to be developed to bridge these gaps for evidence synthesis.

Fouladvand et al used an administrative claims database, Truven, to develop prediction models of substance use disorder from long-term attention deficit hyperactivity disorder (ADHD) medication records. They explored a number of state-of-the-art deep learning models (i.e. recurrent neural networks (RNNs)) to exploit the rich information in temporal health care data. Their best model achieved an F1-score of 0.82 using a long short-term memory (LSTM) architecture.

Li et al undertook more traditional ontology work, transforming the Research Domain Criteria (RDoC) matrix into an ontological structure. The National Institute of Mental Health (NIMH) in the United States launched the RDoC project in 2009 to create a framework for research on pathophysiology, especially genomics and neuroscience. One key goal of RDoC is to understand the nature of mental illnesses and ultimately inform future classification of mental disorders. Nevertheless, existing RDoC elements have limitations, especially for data normalization. An ontology development effort for RDoC is certainly worth applauding.

The last three articles, from Luo et al, Wang et al, and Zhao et al, form a cluster of studies that leveraged social media data. In particular, all three analyzed Twitter data for different mental health conditions. Luo et al examined temporal patterns of suicidal behavior, Wang et al explored the associations between dietary supplement intake and sentiments in mental disorder-related tweets, and Zhao et al assessed mental health signals among sexual and gender minorities. The three studies reflect the rising popularity of exploring social media data sets. However, existing social media studies are mostly exploratory, and researchers who use these data sets need to think carefully about the inherent methodological and data issues, such as data stability and data representativeness. Even so, the use and application of social media in health research certainly warrant further in-depth investigation.²

References

1. Stambaugh, Forman-Hoffman, Williams, et al. Prevalence of serious mental illness among parents in the United States: results from the National Survey of Drug Use and Health, 2008-2014. Ann Epidemiol 2017; 27: 222–224.

2. Bian, Guo, et al. (eds). Social Web and Health Research: Benefits, Limitations, and Best Practices. Cham: Springer.