Aligning Language Models with Clinical Expertise: DPO for Heart Failure Nursing Documentation in Critical Care

📅 2025-10-06

🤖 AI Summary

Nursing documentation of heart failure care in intensive care units (ICUs) suffers from inconsistent terminology, informal phrasing, and a lack of standardization, all of which impede clinical decision-making and patient safety. To address this, we propose what is, to our knowledge, the first application of Direct Preference Optimization (DPO) to generating standardized heart failure nursing notes. Using 8,838 real-world MIMIC-III nursing notes and 21,210 preference pairs drawn from expert-verified GPT outputs, model generations, and original notes, we fine-tune Mistral-7B, a lightweight model that can be deployed locally under privacy-preserving constraints. Documentation quality improves on every reported measure: BLEU increases by 84% (0.173 to 0.318), BERTScore rises by 7.6% (0.828 to 0.891), and expert evaluations show statistically significant improvements in accuracy, completeness, and logical coherence (p < 0.01). The result is a preference optimization framework for standardizing clinical documentation that can be checked with both automatic metrics and expert review.

📝 Abstract

Nursing documentation in intensive care units (ICUs) provides essential clinical intelligence but often suffers from inconsistent terminology, informal styles, and lack of standardization, challenges that are particularly critical in heart failure care. This study applies Direct Preference Optimization (DPO) to adapt Mistral-7B, a locally deployable language model, using 8,838 heart failure nursing notes from the MIMIC-III database and 21,210 preference pairs derived from expert-verified GPT outputs, model generations, and original notes. Evaluation across BLEU, ROUGE, BERTScore, Perplexity, and expert qualitative assessments demonstrates that DPO markedly enhances documentation quality. Specifically, BLEU increased by 84% (0.173 to 0.318), BERTScore improved by 7.6% (0.828 to 0.891), and expert ratings rose across accuracy (+14.4 points), completeness (+14.5 points), logical consistency (+14.1 points), readability (+11.1 points), and structural clarity (+6.0 points). These results indicate that DPO can align lightweight clinical language models with expert standards, supporting privacy-preserving, AI-assisted documentation within electronic health record systems to reduce administrative burden and improve ICU patient safety.
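The percentage gains quoted in the abstract are relative changes computed from the before and after scores. As a quick arithmetic check, the sketch below recomputes them from the four values given above; the helper name `relative_gain` is introduced here for illustration only and is not taken from the paper.

```python
def relative_gain(before: float, after: float) -> float:
    """Return the percentage change going from `before` to `after`."""
    return (after - before) / before * 100.0


# Scores as reported in the abstract (baseline -> after DPO).
bleu_gain = relative_gain(0.173, 0.318)
bert_gain = relative_gain(0.828, 0.891)

print(f"BLEU relative gain:      {bleu_gain:.1f}%")   # 83.8%, reported as 84%
print(f"BERTScore relative gain: {bert_gain:.1f}%")   # 7.6%
```

The expert ratings, by contrast, are reported as absolute point differences (for example +14.4 points for accuracy), so they should not be read as percentages of the baseline.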

Problem

Research questions and friction points this paper is trying to address.

Improving inconsistent terminology in ICU nursing documentation

Standardizing informal styles in heart failure care records

Enhancing clinical documentation quality using expert-aligned language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Applied Direct Preference Optimization to the Mistral-7B model

Built 21,210 preference pairs from expert-verified GPT outputs, model generations, and original nursing notes

Improved documentation quality while keeping the model locally deployable for privacy-preserving use
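For readers unfamiliar with DPO, the sketch below shows the standard per-pair DPO loss in plain Python. It is a minimal illustration of the general technique, not the authors' training code: the value `beta=0.1`, the function names, and the example log-probabilities are all assumptions made here, and the page does not state the paper's actual hyperparameters or training setup. Each preference pair supplies a preferred ("chosen") note and a dispreferred ("rejected") note for the same prompt, and the loss rewards the fine-tuned policy for raising the chosen note's log-probability relative to a frozen reference model more than it raises the rejected note's.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; not taken from the paper
) -> float:
    """Standard DPO loss for a single (chosen, rejected) pair.

    Each argument is the summed token log-probability of a full response
    under either the policy being trained or the frozen reference model.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Loss is small when the chosen response's reward exceeds the rejected one's.
    return -log_sigmoid(chosen_reward - rejected_reward)


# Hypothetical log-probabilities, for illustration only.
untrained = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy == reference
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)    # policy now favors chosen
print(untrained, improved)
```

When the policy and the reference model agree, the loss equals log 2 (about 0.693); it drops below that as the policy shifts probability toward the preferred note. In practice this quantity is averaged over batches of pairs and minimized with a gradient-based optimizer.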
