Leveraging large language models and traditional machine learning ensembles for ADHD detection from narrative transcripts

📅 2025-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing the challenge of automated ADHD diagnosis from psychiatric clinical narrative texts—complicated by linguistic ambiguity and contextual complexity—this study proposes the first tripartite ensemble framework integrating an open-source large language model (LLaMA3) with two interpretable traditional models: fine-tuned RoBERTa and TF-IDF–SVM. The framework jointly models long-range semantics and discriminative features via majority voting. Evaluated on 441 real-world clinical transcription samples for binary ADHD classification, it achieves an F₁-score of 0.71 (95% CI: [0.60–0.80]), with significantly higher recall than any individual constituent model. By synergizing deep semantic understanding with transparent, human-interpretable decision pathways, the approach demonstrates robustness and clinical applicability for text-based psychiatric disorder diagnosis—bridging the gap between performance and interpretability in mental health NLP.
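The summary reports an F₁-score of 0.71 with a 95% CI of [0.60–0.80]. The paper does not state how the interval was obtained, but a percentile bootstrap over the validation predictions is a common choice; the sketch below (an illustrative assumption, not the authors' code) shows how such an interval could be computed for binary F₁.

```python
import random

def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall (positive class = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def bootstrap_f1_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for F1: resample (label, prediction) pairs
    with replacement and take the alpha/2 and 1-alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    scores.sort()
    lo = scores[int(alpha / 2 * n_boot)]
    hi = scores[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

On a validation set the size of this study's (89 samples), bootstrap intervals are wide, which is consistent with the ±0.10 spread reported around the 0.71 point estimate.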

📝 Abstract
Despite rapid advances in large language models (LLMs), their integration with traditional supervised machine learning (ML) techniques that have proven applicability to medical data remains underexplored. This is particularly true for psychiatric applications, where narrative data often exhibit nuanced linguistic and contextual complexity, and can benefit from the combination of multiple models with differing characteristics. In this study, we introduce an ensemble framework for automatically classifying Attention-Deficit/Hyperactivity Disorder (ADHD) diagnosis (binary) using narrative transcripts. Our approach integrates three complementary models: LLaMA3, an open-source LLM that captures long-range semantic structure; RoBERTa, a pre-trained transformer model fine-tuned on labeled clinical narratives; and a Support Vector Machine (SVM) classifier trained using TF-IDF-based lexical features. These models are aggregated through a majority voting mechanism to enhance predictive robustness. The dataset comprises 441 instances: 352 for training and 89 for validation. Empirical results show that the ensemble outperforms individual models, achieving an F₁-score of 0.71 (95% CI: [0.60–0.80]). Compared to the best-performing individual model (SVM), the ensemble improved recall while maintaining competitive precision. This indicates the strong sensitivity of the ensemble in identifying ADHD-related linguistic cues. These findings demonstrate the promise of hybrid architectures that leverage the semantic richness of LLMs alongside the interpretability and pattern recognition capabilities of traditional supervised ML, offering a new direction for robust and generalizable psychiatric text classification.
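The aggregation step described in the abstract — three models, each emitting a binary label, combined by majority vote — can be sketched in a few lines. This is a minimal illustration of the voting rule, not the authors' implementation; with three voters and binary labels, ties cannot occur.

```python
from collections import Counter

def majority_vote(model_predictions):
    """Combine per-model binary label lists into one ensemble prediction.

    model_predictions: a list with one equal-length list of 0/1 labels
    per model (here: LLaMA3, fine-tuned RoBERTa, TF-IDF SVM).
    """
    ensemble = []
    for labels in zip(*model_predictions):  # labels for one sample, across models
        ensemble.append(Counter(labels).most_common(1)[0][0])
    return ensemble
```

For example, if the three models predict `[1, 0, 1]`, `[1, 1, 0]`, and `[0, 1, 1]` on three samples, the ensemble outputs `[1, 1, 1]`. A majority vote can raise recall over any single model, as reported here, because a positive case missed by one model is still recovered whenever the other two detect it.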
Problem

Research questions and friction points this paper is trying to address.

Detecting ADHD from narrative transcripts using hybrid models
Integrating LLMs with traditional ML for psychiatric classification
Improving ADHD diagnosis accuracy through ensemble learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble of LLM and traditional ML models
Majority voting for enhanced predictive robustness
Hybrid architecture for psychiatric text classification
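The SVM branch of the ensemble is trained on TF-IDF lexical features. As a hedged sketch of that feature step, the pure-Python function below computes smoothed TF-IDF weights (mirroring scikit-learn's default smoothed IDF formula; the paper does not specify which implementation or settings were used).

```python
import math
from collections import Counter

def tfidf(docs):
    """Map each tokenized document to a dict of TF-IDF weights.

    Uses term frequency normalized by document length, and the smoothed
    inverse document frequency log((1 + N) / (1 + df)) + 1.
    """
    n = len(docs)
    df = Counter()                       # document frequency per term
    for doc in docs:
        df.update(set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors
```

Terms that appear in fewer transcripts get higher weights, so distinctive lexical cues dominate the SVM's feature space — one reason this branch stays interpretable: each feature is a visible, weighted word.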
Yuxin Zhu
Emory University, Atlanta, Georgia, USA
Yuting Guo
Emory University, Atlanta, Georgia, USA
Noah Marchuck
Emory University, Atlanta, Georgia, USA
Abeed Sarker
Emory University School of Medicine
Natural Language Processing · Biomedical Informatics · Health Data Science · Applied Machine Learning
Yun Wang
Emory University, Atlanta, Georgia, USA