Automated ICD Classification of Psychiatric Diagnoses: From Classical NLP to Large Language Models

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the labor-intensive task of mapping free-text psychiatric clinical notes to ICD codes. It systematically evaluates automatic classification models, from traditional term-frequency approaches (e.g., Bag-of-Words, TF-IDF) to state-of-the-art large language models (e.g., e5_large, BioLORD, Llama-3-8B), on a dataset of 145,513 Spanish-language clinical records. It presents the first comprehensive comparison between classical NLP methods and large language models in the psychiatric domain and shows that fine-tuning large models on clinical terminology is crucial for handling long-tailed label distributions and semantic ambiguity. Among all models tested, e5_large performed best after end-to-end fine-tuning, reaching a micro F1-score of 0.866 and outperforming the conventional approaches.
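To illustrate why term-frequency representations can miss semantic cues, the following is a minimal pure-Python TF-IDF sketch, not the pipeline used in the study. The three Spanish descriptions, the whitespace tokenizer, and the smoothed IDF formula are assumptions chosen for illustration. It shows that two descriptions of a similar condition with no shared tokens receive a cosine similarity of exactly zero.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: token -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF (an assumption) keeps weights positive for common tokens.
        vectors.append({
            tok: (count / len(toks)) * (math.log((1 + n_docs) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(tok, 0.0) for tok, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical free-text descriptions, invented for this example.
docs = [
    "trastorno depresivo mayor",
    "episodio depresivo mayor",
    "depresión grave",
]
vecs = tfidf_vectors(docs)
sim_shared = cosine(vecs[0], vecs[1])    # two tokens in common
sim_disjoint = cosine(vecs[0], vecs[2])  # no tokens in common

print(f"shared tokens:   {sim_shared:.3f}")
print(f"disjoint tokens: {sim_disjoint:.3f}")
```

A dense sentence embedding would typically place the first and third descriptions close together, which is the kind of implicit semantic cue the evaluated transformer models are reported to capture.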

📝 Abstract

Mental health has become a global priority, leading to a massive administrative burden in the coding of clinical diagnoses. This study proposes the automation of psychiatric diagnostic analysis by mapping free-text descriptions to the International Classification of Diseases (ICD) using Natural Language Processing (NLP) and Machine Learning (ML) techniques. Using a specialized dataset of 145,513 Spanish psychiatric descriptions, various text representation paradigms were evaluated, ranging from classical frequency-based models (BoW, TF-IDF) to state-of-the-art Large Language Models (LLMs) such as e5_large, BioLORD, and Llama-3-8B. Results indicate that transformer-based embeddings consistently outperform traditional methods by capturing implicit semantic cues and nuanced medical terminology. The e5_large model, through end-to-end fine-tuning, achieved the highest performance with an $F1_{micro}$ score of 0.866. This research demonstrates that adapting LLMs to specific clinical nomenclature is essential for overcoming the challenges of "long-tail" label distributions and the inherent ambiguity of psychiatric discourse.
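Because the headline metric is $F1_{micro}$ and the abstract stresses long-tail label distributions, a small worked example may help show how micro-averaging behaves on imbalanced labels. The sketch below assumes single-label classification (one ICD code per description, which the abstract does not state) and uses invented labels and predictions, not data from the study. The helper names `f1` and `micro_macro_f1` are hypothetical.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro F1, macro F1) for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label == pred_label:
            tp[true_label] += 1
        else:
            fp[pred_label] += 1  # predicted class gains a false positive
            fn[true_label] += 1  # true class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Invented long-tail toy data: a frequent code (F32) and a rare one (F20).
y_true = ["F32"] * 8 + ["F20"] * 2
# A classifier that always predicts the frequent code.
y_pred = ["F32"] * 10

micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.3f}")  # 0.800
print(f"macro F1 = {macro:.3f}")  # 0.444
```

In this toy case the micro score stays at 0.800 even though the rare code is never predicted, while the macro score drops to 0.444. This is one reason long-tail behavior is usually examined alongside a micro-averaged headline number.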

Problem

Research questions and friction points this paper is trying to address.

ICD classification

psychiatric diagnosis

automated coding

clinical text

mental health

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models

ICD Classification

Psychiatric Diagnosis