Sage Journals: Discover world-class research

Abstract

Tweets are a primary source of diverse digital data, providing information about a multitude of topics. People send so many tweets every day that no human reader can discern which ones are relevant to health. Health authorities may overlook or underestimate the significance of health-related tweets, potentially weakening public health awareness policies and impeding the advancement of their healthcare systems. This research examined which tweets, within the context of health-related topics and news, pertain to people's health and which do not. The research used a dataset of health-related tweets from four leading news media companies worldwide. The study employed a Long Short-Term Memory (LSTM) network to detect health-related tweets in a news tweet dataset; LSTM was chosen for its versatility and robustness in handling diverse text sources, which is essential across various healthcare systems and contexts. The performance analysis shows that the “Reuters health-related tweets news dataset” yields the best results, achieving an accuracy of 99.60%, precision of 99.63%, recall of 99.60%, false positive rate (FPR) of 0.40%, and F1 score of 99.61%. The research offers methodological insight into detecting health-related user tweets. It also found that trends can be identified to detect health-related topics across various timeframes. Future work can extend this model to detect different kinds of infectious diseases.
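The five reported metrics can all be derived from the counts in a binary confusion matrix. The sketch below is illustrative only: it assumes a two-class setting (health-related versus not), the function name `classification_metrics` is hypothetical, and the counts are invented for the example rather than taken from the study. The abstract does not state how the study computed or averaged its metrics.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives, false negatives.
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "fpr": fpr,
        "f1": f1,
    }


# Hypothetical counts for illustration; these are NOT results from the paper.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting FPR alongside precision and recall is useful when the two classes are imbalanced, since accuracy alone can look high even if most positive cases are missed.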

Keywords

Tweets, healthcare-related tweet, LSTM



Detecting health-related tweets for developing healthcare systems
