Abstract
Detecting mental illness from short social media posts is challenging because these texts are often brief, fragmented, and lack explicit descriptions of the user’s mental state. Prior studies using encoder-based models such as BERT show promise but struggle when key contextual information is missing. To address this, we propose a method that augments posts with interpretive sentences generated by MentaLLaMA-chat, a generative model specialized in mental health, and fine-tunes BERT on the augmented dataset. We curated 1,525 Japanese posts containing the word “mental” (in katakana) from X (formerly Twitter) and manually annotated them according to Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria, labeling 557 posts as positive and 968 as negative. Our method improved recall by 2.4 percentage points over models trained on the original posts alone, while maintaining comparable accuracy and precision. Shapley Additive Explanations (SHAP) analysis revealed that tokens introduced by the interpretive sentences—including both negative and positive expressions—enhanced the model’s ability to identify mental-distress posts. These results demonstrate that generative-model-based text augmentation effectively provides additional context, enabling more accurate detection of mental illness indicators in short, ambiguous social media posts.