Automated ICD Classification of Psychiatric Diagnoses: From Classical NLP to Large Language Models

📅 2026-05-20
🤖 AI Summary
This study addresses the labor-intensive task of mapping free-text psychiatric clinical notes to ICD codes. It systematically evaluates automatic classification models, ranging from traditional term-frequency approaches (Bag-of-Words, TF-IDF) to transformer-based language models (e5_large, BioLORD, Llama-3-8B), on a dataset of 145,513 Spanish-language clinical records. It presents the first comprehensive comparison between classical NLP methods and large language models in the psychiatric domain, and it shows that fine-tuning large models on clinical terminology is crucial for handling long-tailed label distributions and semantic ambiguity. Among all models tested, e5_large performed best after end-to-end fine-tuning, reaching a micro F1-score of 0.866 and outperforming the conventional approaches.
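To make the term-frequency baseline concrete, here is a minimal sketch of TF-IDF vectors in plain Python. The three training descriptions, the nearest-neighbour decision rule, and all function names are illustrative assumptions; they are not taken from the study's data or pipeline.

```python
import math
from collections import Counter

# Hypothetical toy examples (ICD-10 chapter F codes); not from the study's dataset.
TRAIN = [
    ("episodio depresivo moderado", "F32"),
    ("trastorno de ansiedad generalizada", "F41"),
    ("esquizofrenia paranoide", "F20"),
]

def tokenize(text):
    # Lowercased whitespace tokenization; a real pipeline would do more.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency for every token seen in training.
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def tfidf(text, idf):
    # Sparse TF-IDF vector; tokens unseen in training are dropped.
    tf = Counter(tok for tok in tokenize(text) if tok in idf)
    return {tok: count * idf[tok] for tok, count in tf.items()}

def cosine(a, b):
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(text, train, idf):
    # Assumed decision rule: label of the most similar training description.
    query = tfidf(text, idf)
    best = max(train, key=lambda pair: cosine(query, tfidf(pair[0], idf)))
    return best[1]

IDF = fit_idf([text for text, _ in TRAIN])
print(predict("episodio depresivo grave", TRAIN, IDF))
```

A description that shares no vocabulary with the training set, such as "tristeza persistente", has zero similarity to every example here. That limitation of purely lexical representations is one reason dense embeddings are attractive for this task.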
📝 Abstract
Mental health has become a global priority, leading to a massive administrative burden in the coding of clinical diagnoses. This study proposes the automation of psychiatric diagnostic analysis by mapping free-text descriptions to the International Classification of Diseases (ICD) using Natural Language Processing (NLP) and Machine Learning (ML) techniques. Utilizing a specialized dataset of 145,513 Spanish psychiatric descriptions, various text representation paradigms were evaluated, ranging from classical frequency-based models (BoW, TF-IDF) to state-of-the-art Large Language Models (LLMs) such as e5_large, BioLORD, and Llama-3-8B. Results indicate that transformer-based embeddings consistently outperform traditional methods by capturing implicit semantic cues and nuanced medical terminology. The e5_large model, through end-to-end fine-tuning, achieved the highest performance with an $F1_{micro}$ score of 0.866. This research demonstrates that adapting LLMs to specific clinical nomenclature is essential for overcoming the challenges of "long-tail" label distributions and the inherent ambiguity of psychiatric discourse.
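The reported metric pools counts over all classes: $F1_{micro} = 2\,TP / (2\,TP + FP + FN)$. The sketch below contrasts it with the macro average on a long-tailed toy sample. It assumes one ICD code per description, and the labels and predictions are hypothetical values chosen for illustration, not results from the study.

```python
from collections import Counter

def f1(tp, fp, fn):
    # F1 from raw counts; defined as 0.0 when there is nothing to score.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_macro_f1(y_true, y_pred):
    # Per-class true positives, false positives, and false negatives
    # for single-label predictions (an assumption of this sketch).
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    labels = sorted(set(y_true) | set(y_pred))
    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro

# Hypothetical long-tailed sample: one frequent code and two rare ones,
# scored against a classifier that always predicts the frequent code.
y_true = ["F32"] * 8 + ["F20", "F60"]
y_pred = ["F32"] * 10

micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy case the micro score stays high while the macro score collapses, because the rare codes are never predicted. A micro-averaged figure is therefore best read alongside per-class results when the label distribution is long-tailed.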
Problem

Research questions and friction points this paper is trying to address.

ICD classification
psychiatric diagnosis
automated coding
clinical text
mental health
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
ICD Classification
Psychiatric Diagnosis
Transformer Embeddings
Clinical NLP
Authors
Fernando Ortega
Dpto. Sistemas Informáticos, ETSI de Sistemas Informáticos, Universidad Politécnica de Madrid
Recommender Systems, Collaborative Filtering, Machine Learning, Social Networks
Raúl Lara-Cabrera
Department of Sistemas Informáticos, Universidad Politécnica de Madrid, Spain
Jorge Dueñas-Lerín
Department of Sistemas Informáticos, Universidad Politécnica de Madrid, Spain
Alejandro de la Torre-Luque
Department of Legal Medicine, Psychiatry and Pathology, Complutense University of Madrid, Spain
Mercè Salvador Robert
Hospital Universitario de Móstoles, Universidad Rey Juan Carlos, Spain
Enrique Baca-García
Department of Psychiatry, University Hospital Jiménez Díaz Foundation, Madrid, Spain