Abstract
Tweets are a primary source of diverse digital data, providing information about a multitude of topics. Every day, people post so many tweets that no human reader can discern which ones are relevant to health. Health authorities may therefore overlook or underestimate the significance of health-related tweets, potentially weakening public health awareness policies and impeding the advancement of their healthcare systems. This research examined which tweets, within the context of health-related topics and news, pertain to people's health and which do not. It used a health-related tweets dataset drawn from four top news media companies worldwide. The study employed a Long Short-Term Memory (LSTM) network to detect health-related tweets in a news tweet dataset, chosen for its versatility and robustness in handling diverse text sources, which is essential across various healthcare systems and contexts. The performance analysis shows that the “Reuters health-related tweets news dataset” yields the best results, achieving an accuracy of 99.60%, precision of 99.63%, recall of 99.60%, false positive rate (FPR) of 0.40%, and F1 score of 99.61%. The research offers methodological insight into detecting health-related user tweets. It also found that trends can be identified to detect health-related topics across various timeframes. Future work can extend this model to detect different kinds of infectious diseases.
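For readers less familiar with the reported measures, the following is a minimal sketch of how accuracy, precision, recall, FPR, and F1 score are conventionally computed from the confusion counts of a binary classifier. It uses the standard textbook definitions; the abstract does not state whether the study's figures are per-class or averaged, so this is an illustration and not the authors' evaluation code. The function name `binary_metrics`, the `BinaryMetrics` container, and the counts in the usage lines are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    fpr: float
    f1: float


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> BinaryMetrics:
    """Compute standard binary classification metrics from confusion counts.

    tp/fn: health-related tweets classified correctly/incorrectly.
    tn/fp: non-health tweets classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")

    accuracy = (tp + tn) / total
    # Each ratio falls back to 0.0 when its denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, fpr, f1)


# Hypothetical counts for illustration only (not results from the study).
m = binary_metrics(tp=90, fp=10, tn=880, fn=20)
print(m)
```

Note that FPR is computed over the negative class only, so a low FPR and a high accuracy can coexist with weaker recall when health-related tweets are rare; reporting all five measures together, as the abstract does, guards against that.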
