LLM-Augmented Therapy Normalization and Aspect-Based Sentiment Analysis for Treatment-Resistant Depression on Reddit

📅 2026-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the underrepresentation of real-world patient experiences with pharmacological treatments in clinical trials for treatment-resistant depression (TRD). To bridge this gap, the authors curated a corpus of TRD-related Reddit posts containing 23,399 medication mentions and developed a fine-grained aspect-based sentiment analysis approach. The method uses large language models for drug name normalization and data augmentation, followed by fine-tuning of DeBERTa-v3. On the SMM4H 2023 test set, the model achieves a micro-F1 score of 0.800. The resulting analysis is presented as the first large-scale, drug-oriented sentiment quantification in TRD patient-generated content. Conventional antidepressants, especially SSRIs and SNRIs, draw more negative than positive mentions, whereas ketamine and esketamine show comparatively more favorable sentiment.
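As a rough illustration of the reported metric, the sketch below computes a micro-averaged F1 score for a three-class sentiment task by pooling true positives, false positives, and false negatives across classes. It assumes all three classes are included in the average (in which case micro-F1 equals accuracy for single-label data); the shared task's official scorer may differ, and the function name `micro_f1` and the toy labels are hypothetical.

```python
# Minimal sketch of micro-averaged F1 for single-label, three-class sentiment.
# Assumption: every class counts toward the average; the official SMM4H
# scorer is not reproduced here.

LABELS = ("positive", "neutral", "negative")


def micro_f1(gold, pred, labels=LABELS):
    """Pool TP/FP/FN over all labels, then compute F1 from the pooled counts."""
    tp = fp = fn = 0
    for label in labels:
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1  # predicted this label and it was correct
            elif p == label and g != label:
                fp += 1  # predicted this label but gold was another class
            elif g == label and p != label:
                fn += 1  # gold was this label but another class was predicted
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data, not taken from the paper.
gold = ["positive", "neutral", "negative", "neutral", "neutral"]
pred = ["positive", "neutral", "neutral", "neutral", "negative"]
score = micro_f1(gold, pred)
```

With three of five toy predictions correct, the pooled counts give the same value as plain accuracy, which is the expected behavior when no class is excluded from the average.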

📝 Abstract
Treatment-resistant depression (TRD) is a severe form of major depressive disorder in which patients do not achieve remission despite multiple adequate treatment trials. Evidence across pharmacologic options for TRD remains limited, and trials often do not fully capture patient-reported tolerability. Large-scale online peer-support narratives therefore offer a complementary lens on how patients describe and evaluate medications in real-world use. In this study, we curated a corpus of 5,059 Reddit posts explicitly referencing TRD from 3,480 subscribers across 28 mental health-related subreddits from 2010 to 2025. Of these, 3,839 posts mentioned at least one medication, yielding 23,399 mentions of 81 generic-name medications after lexicon-based normalization of brand names, misspellings, and colloquialisms. We developed an aspect-based sentiment classifier by fine-tuning DeBERTa-v3 on the SMM4H 2023 therapy-sentiment Twitter corpus with large language model-based data augmentation, achieving a micro-F1 score of 0.800 on the shared-task test set. Applying this classifier to Reddit, we quantified sentiment toward individual medications across three categories (positive, neutral, and negative) and tracked patterns by drug, subscriber, subreddit, and year. Overall, 72.1% of medication mentions were neutral, 14.8% negative, and 13.1% positive. Conventional antidepressants, especially SSRIs and SNRIs, showed consistently higher negative than positive proportions, whereas ketamine and esketamine showed comparatively more favorable sentiment profiles. These findings show that normalized medication extraction combined with aspect-based sentiment analysis can help characterize patient-perceived treatment experiences in TRD-related Reddit discourse, complementing clinical evidence with large-scale patient-generated perspectives.
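The two data-processing steps described in the abstract, lexicon-based normalization of medication mentions and per-drug aggregation of sentiment labels, can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the tiny `LEXICON`, the function names, and the example labels are hypothetical, and the sentiment labels would in practice come from the fine-tuned classifier.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon mapping surface forms (brand names, a misspelling,
# a colloquialism) to generic names. The paper's actual lexicon is not shown.
LEXICON = {
    "sertraline": "sertraline",
    "zoloft": "sertraline",
    "sertaline": "sertraline",  # common misspelling
    "esketamine": "esketamine",
    "spravato": "esketamine",
    "ketamine": "ketamine",
    "special k": "ketamine",  # colloquialism
}

# Longest variants first so multi-word forms are tried before shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(v) for v in sorted(LEXICON, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def extract_mentions(text):
    """Return the generic name for each medication mention found in text."""
    return [LEXICON[m.group(1).lower()] for m in _PATTERN.finditer(text)]


def sentiment_proportions(labeled_mentions):
    """Turn (generic_name, label) pairs into per-drug label proportions."""
    counts = {}
    for drug, label in labeled_mentions:
        counts.setdefault(drug, Counter())[label] += 1
    return {
        drug: {
            label: c[label] / sum(c.values())
            for label in ("positive", "neutral", "negative")
        }
        for drug, c in counts.items()
    }


# Toy inputs, not taken from the corpus.
mentions = extract_mentions(
    "Zoloft did nothing for me, but Spravato and special K helped."
)
props = sentiment_proportions(
    [
        ("sertraline", "negative"),
        ("sertraline", "neutral"),
        ("sertraline", "negative"),
        ("ketamine", "positive"),
    ]
)
```

Word boundaries keep `ketamine` from matching inside `esketamine`, so the two drugs are counted separately, which matters when comparing their sentiment profiles.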
Problem

Research questions and friction points this paper is trying to address.

treatment-resistant depression
patient-reported outcomes
medication sentiment
real-world evidence
aspect-based sentiment analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based data augmentation
aspect-based sentiment analysis
therapy normalization
treatment-resistant depression
DeBERTa-v3 fine-tuning
Yuxin Zhu
Department of Biomedical Informatics, Emory University, Atlanta, GA
Sahithi Lakamana
Department of Biomedical Informatics, Emory University, Atlanta, GA
Masoud Rouhizadeh
Assistant Professor, University of Florida
AI in Healthcare, Medical Informatics, Natural Language Processing, Machine Learning, Population Health Informatics
Selen Bozkurt
Department of Biomedical Informatics, Emory University, Atlanta, GA
Rachel Hershenberg
Department of Psychiatry and Behavioral Sciences, Emory University, Atlanta, GA
Abeed Sarker
Emory University School of Medicine
Natural Language Processing, Biomedical Informatics, Health Data Science, Applied Machine Learning