Empathy by Design: Aligning Large Language Models for Healthcare Dialogue

📅 2025-12-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
General-purpose large language models (LLMs) exhibit two critical deficiencies in medical dialogue: factual unreliability and empathetic inadequacy, both of which undermine trust and safety for non-professional users and caregivers. To address this, we propose a domain-specific direct preference optimization (DPO) alignment framework tailored for healthcare contexts. Our approach leverages a domain-adapted base model and carefully curated pairwise preference data that explicitly favors supportive, layperson-accessible language while penalizing technical jargon and directive tones, thereby jointly improving factual accuracy and empathetic communication. Unlike reinforcement-learning-based methods, our DPO framework avoids complex reward modeling and policy optimization, offering better scalability and interpretability. Extensive experiments demonstrate that the aligned model significantly outperforms baseline LLMs and commercial systems (e.g., Google's medical dialogue systems) in semantic coherence, factual correctness, and human-rated empathy scores.

📝 Abstract
General-purpose large language models (LLMs) have demonstrated remarkable generative and reasoning capabilities but remain limited in healthcare and caregiving applications due to two key deficiencies: factual unreliability and a lack of empathetic communication. These shortcomings pose significant risks in sensitive contexts where users, particularly non-professionals and caregivers, seek medically relevant guidance or emotional reassurance. To address these challenges, we introduce a Direct Preference Optimization (DPO)-based alignment framework designed to improve factual correctness, semantic coherence, and human-centric qualities such as empathy, politeness, and simplicity in caregiver-patient dialogues. Our approach fine-tunes domain-adapted LLMs using pairwise preference data, where preferred responses reflect supportive and accessible communication styles while rejected ones represent prescriptive or overly technical tones. This direct optimization method aligns model outputs with human preferences more efficiently than traditional reinforcement-learning-based alignment. Empirical evaluations across multiple open and proprietary LLMs show that our DPO-tuned models achieve higher semantic alignment, improved factual accuracy, and stronger human-centric evaluation scores compared to baseline and commercial alternatives such as Google's medical dialogue systems. These improvements demonstrate that preference-based alignment offers a scalable and transparent pathway toward developing trustworthy, empathetic, and clinically informed AI assistants for caregiver and healthcare communication. Our open-source code is available at: https://github.com/LeonG19/Empathy-by-Design
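The DPO objective the abstract refers to can be sketched in a few lines. The function below is an illustrative reimplementation of the standard DPO loss, not code from the paper, and the log-probability values in the usage example are invented.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for a single preference pair.

    logp_* are summed token log-probabilities of each full response under
    the policy being tuned; ref_logp_* are the same quantities under the
    frozen reference model. beta controls how far the policy may drift
    from the reference.
    """
    # Implicit reward margin: how strongly each model prefers the chosen
    # (supportive, layperson-accessible) response over the rejected
    # (prescriptive, jargon-heavy) one.
    policy_margin = logp_chosen - logp_rejected
    ref_margin = ref_logp_chosen - ref_logp_rejected
    # Logistic (Bradley-Terry) loss on the beta-scaled margin difference:
    # -log sigmoid(beta * (policy_margin - ref_margin))
    logits = beta * (policy_margin - ref_margin)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Invented numbers: the policy already favors the chosen response more
# than the reference does, so the loss falls below log(2) ~= 0.693.
loss = dpo_loss(logp_chosen=-12.0, logp_rejected=-15.0,
                ref_logp_chosen=-13.0, ref_logp_rejected=-13.5)
```

Because the reference model's log-probabilities enter only as fixed offsets, no separate reward model or policy-gradient loop is needed, which is the efficiency argument made above against reinforcement-learning-based alignment.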
Problem

Research questions and friction points this paper is trying to address.

Improves factual accuracy and empathy in healthcare AI dialogues
Addresses unreliable and non-empathetic communication in LLMs for caregiving
Enhances human-centric qualities like politeness and simplicity in medical conversations
Innovation

Methods, ideas, or system contributions that make the work stand out.

DPO-based alignment framework for healthcare dialogue
Fine-tunes LLMs with pairwise preference data
Improves factual accuracy and empathetic communication
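A single training record of the kind these bullets describe might look as follows. The prompt and responses are invented illustrations of the preferred (supportive, plain-language) versus rejected (prescriptive, jargon-heavy) styles, not entries from the paper's dataset.

```python
# Hypothetical pairwise preference record; the field names follow the
# common (prompt, chosen, rejected) convention used by DPO training code.
preference_pair = {
    "prompt": "My mother keeps forgetting her blood pressure pills. "
              "What can I do?",
    # Preferred: supportive tone, plain language, no directives.
    "chosen": ("That sounds stressful, and it's a very common worry. "
               "Many caregivers find gentle reminders helpful, such as a "
               "labeled pill organizer or a daily phone alarm. Her doctor "
               "or pharmacist can also suggest routines that fit her day."),
    # Rejected: directive tone and clinical jargon.
    "rejected": ("Ensure strict adherence to the antihypertensive regimen. "
                 "Non-compliance elevates the risk of hypertensive crisis."),
}
```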
Emre Umucu
Department of Public Health Sciences, The University of Texas at El Paso, USA
Guillermina Solis
Department of Nursing, The University of Texas at El Paso, USA
Leon Garza
Department of Computer Science, The University of Texas at El Paso, USA
Emilia Rivas
Department of Computer Science, The University of Texas at El Paso, USA
Beatrice Lee
Department of Rehabilitation Sciences, The University of Texas at El Paso, USA
Anantaa Kotal
Department of Computer Science, The University of Texas at El Paso, USA
Aritran Piplai
The University of Texas at El Paso
Artificial intelligence, Knowledge extraction, Cyber security