PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions

📅 2025-05-23
🤖 AI Summary
Problem: Existing patient simulators struggle to combine clinical authenticity with persona diversity, which hinders training and evaluating large language models (LLMs) on multi-turn, context-aware doctor-patient dialogues.
Method: PatientSim models each patient persona along four axes (personality, language proficiency, medical history recall level, and cognitive confusion level), yielding 37 persona combinations, and pairs them with clinical profiles derived from real-world data in the MIMIC-ED and MIMIC-IV datasets.
Contribution/Results: Eight LLMs were evaluated for factual accuracy and persona consistency, and four clinicians validated the top-performing open-source model, Llama 3.3. The open-source, privacy-compliant system is customizable and can serve as a testbed for medical dialogue systems and as a potential educational tool.

📝 Abstract
Doctor-patient consultations require multi-turn, context-aware communication tailored to diverse patient personas. Training or evaluating doctor LLMs in such settings requires realistic patient interaction systems. However, existing simulators often fail to reflect the full range of personas seen in clinical practice. To address this, we introduce PatientSim, a patient simulator that generates realistic and diverse patient personas for clinical scenarios, grounded in medical expertise. PatientSim operates using: 1) clinical profiles, including symptoms and medical history, derived from real-world data in the MIMIC-ED and MIMIC-IV datasets, and 2) personas defined by four axes: personality, language proficiency, medical history recall level, and cognitive confusion level, resulting in 37 unique combinations. We evaluated eight LLMs for factual accuracy and persona consistency. The top-performing open-source model, Llama 3.3, was validated by four clinicians to confirm the robustness of our framework. As an open-source platform, PatientSim provides a reproducible and scalable solution that can be customized for specific training needs. Because it offers a privacy-compliant environment, it serves as a robust testbed for evaluating medical dialogue systems across diverse patient presentations and shows promise as an educational tool for healthcare.
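
The four persona axes can be pictured as a small grid of settings. The sketch below is illustrative only: the abstract names the axes and the total of 37 combinations but not the levels on each axis, so every level name and the `is_allowed` restriction are hypothetical. Since 37 is prime, it cannot be a full cross product of axes that each have at least two levels, so some restriction of this kind must apply; the one shown here is an assumption chosen so the count comes out to 37, not the authors' confirmed rule.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical level names: the source names the four axes, not their levels.
PERSONALITIES = ["neutral", "talkative", "distrustful", "agreeable", "impatient", "anxious"]
LANGUAGE_LEVELS = ["basic", "intermediate", "advanced"]
RECALL_LEVELS = ["low", "high"]
CONFUSION_LEVELS = ["none", "high"]


@dataclass(frozen=True)
class Persona:
    personality: str
    language: str
    recall: str
    confusion: str


def is_allowed(p: Persona) -> bool:
    # Assumed restriction: a highly confused patient is only paired with a
    # single baseline setting on the other three axes.
    if p.confusion == "high":
        return (p.personality, p.language, p.recall) == ("neutral", "intermediate", "high")
    return True


def enumerate_personas() -> list[Persona]:
    # Full cross product of the four axes, filtered by the assumed restriction.
    grid = product(PERSONALITIES, LANGUAGE_LEVELS, RECALL_LEVELS, CONFUSION_LEVELS)
    return [p for p in (Persona(*levels) for levels in grid) if is_allowed(p)]


personas = enumerate_personas()
print(len(personas))
```

Under these assumed levels the full grid has 6 x 3 x 2 x 2 = 72 cells, and the restriction keeps 36 non-confused personas plus one confused persona.
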
Problem

Research questions and friction points this paper is trying to address.

Simulating diverse patient personas for realistic doctor-patient interactions
Addressing limitations of existing patient simulators in clinical practice
Evaluating LLMs for factual accuracy and persona consistency in medical dialogues
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses clinical profiles from MIMIC datasets
Defines personas via four key axes
Validated by clinicians for robustness
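
As a rough illustration of how a clinical profile and a persona might be combined into instructions for a patient-simulating LLM, the sketch below assembles a plain-text system prompt. It is not the PatientSim prompt: the `ClinicalProfile` fields, the `build_patient_prompt` helper, the persona level values, and the prompt wording are all hypothetical, and the field names are not actual MIMIC column names.

```python
from dataclasses import dataclass, field


@dataclass
class ClinicalProfile:
    # Hypothetical fields for illustration; not real MIMIC column names.
    age: int
    sex: str
    chief_complaint: str
    history: list[str] = field(default_factory=list)


def build_patient_prompt(profile: ClinicalProfile, persona: dict[str, str]) -> str:
    """Combine a clinical profile and persona settings into one system prompt."""
    axes = "\n".join(f"- {axis}: {level}" for axis, level in persona.items())
    history = "; ".join(profile.history) or "none recorded"
    return (
        "You are role-playing a patient in a clinical consultation.\n"
        f"Age: {profile.age}\n"
        f"Sex: {profile.sex}\n"
        f"Chief complaint: {profile.chief_complaint}\n"
        f"Medical history: {history}\n"
        "Stay consistent with this persona:\n"
        f"{axes}\n"
        "Only state facts that appear in the profile above."
    )


profile = ClinicalProfile(age=58, sex="F", chief_complaint="chest pain",
                          history=["hypertension", "type 2 diabetes"])
persona = {
    "personality": "impatient",
    "language proficiency": "intermediate",
    "medical history recall": "low",
    "cognitive confusion": "none",
}
prompt = build_patient_prompt(profile, persona)
print(prompt)
```

Keeping the profile (what is true about the patient) separate from the persona (how the patient communicates) makes it straightforward to check factual accuracy against the profile and persona consistency against the axis settings independently.
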
Authors

Daeun Kyung, KAIST
Hyunseung Chung, KAIST
Seongsu Bae, KAIST
Jiho Kim, Ph.D. student, KAIST (Computer Architecture)
Taerim Kim, Samsung Medical Center
Soo Kyung Kim, Ewha Womans University
Edward Choi, KAIST (Machine Learning, Artificial Intelligence, Healthcare)