🤖 AI Summary
Existing patient simulators struggle to balance clinical authenticity with personality diversity, hindering the training and evaluation of large language models (LLMs) in multi-turn, context-aware doctor–patient dialogues.
Method: We propose the first clinically grounded, four-dimensional persona modeling framework (personality, language proficiency, medical history recall level, and cognitive confusion level), which yields 37 composable, realistic patient personas paired with clinical profiles derived from real-world data in the MIMIC-ED and MIMIC-IV datasets. Our approach integrates clinical knowledge graphs, multi-dimensional persona prompt engineering, and Llama 3.3-driven dialogue generation, validated by domain-expert physicians.
Contribution/Results: We evaluate eight state-of-the-art LLMs for factual accuracy and persona consistency; blinded assessments by four clinicians of the top-performing open-source model, Llama 3.3, confirm high clinical fidelity. The open-source, privacy-compliant system supports customizable medical education and standardized benchmarking, marking the first solution to unify clinical realism with scalable personality diversity.
📝 Abstract
Doctor-patient consultations require multi-turn, context-aware communication tailored to diverse patient personas. Training or evaluating doctor LLMs in such settings requires realistic patient interaction systems. However, existing simulators often fail to reflect the full range of personas seen in clinical practice. To address this, we introduce PatientSim, a patient simulator that generates realistic and diverse patient personas for clinical scenarios, grounded in medical expertise. PatientSim operates using: 1) clinical profiles, including symptoms and medical history, derived from real-world data in the MIMIC-ED and MIMIC-IV datasets, and 2) personas defined by four axes: personality, language proficiency, medical history recall level, and cognitive confusion level, resulting in 37 unique combinations. We evaluated eight LLMs for factual accuracy and persona consistency. The top-performing open-source model, Llama 3.3, was validated by four clinicians to confirm the robustness of our framework. As an open-source platform, PatientSim provides a reproducible and scalable solution that can be tailored to specific training needs. Because it offers a privacy-compliant environment, it serves as a robust testbed for evaluating medical dialogue systems across diverse patient presentations and shows promise as an educational tool for healthcare.
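To make the four-axis persona idea concrete, the following is a minimal Python sketch of how personas could be composed from independent axes and turned into a prompt fragment. It is an illustration only, not PatientSim's implementation: the axis names come from the abstract, but the level values, the `Persona` class, the `enumerate_personas` helper, and the prompt wording are all hypothetical. The abstract also does not state which combinations are excluded to arrive at 37, so the sketch exposes that as a user-supplied `is_allowed` filter instead of guessing the rule.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Optional

# Axis names follow the abstract. The level values are hypothetical
# placeholders, not the levels defined by PatientSim.
AXES: Dict[str, List[str]] = {
    "personality": ["neutral", "impatient", "anxious"],
    "language_proficiency": ["basic", "intermediate", "advanced"],
    "recall_level": ["low", "high"],
    "confusion_level": ["normal", "high"],
}


@dataclass(frozen=True)
class Persona:
    """One point in the four-axis persona space (illustrative only)."""

    personality: str
    language_proficiency: str
    recall_level: str
    confusion_level: str

    def to_prompt(self) -> str:
        # Hypothetical prompt wording; a real system prompt would also
        # include the patient's clinical profile.
        return (
            "You are role-playing a patient. "
            f"Personality: {self.personality}. "
            f"Language proficiency: {self.language_proficiency}. "
            f"Medical history recall: {self.recall_level}. "
            f"Cognitive confusion: {self.confusion_level}."
        )


def enumerate_personas(
    axes: Optional[Dict[str, List[str]]] = None,
    is_allowed: Callable[[Persona], bool] = lambda persona: True,
) -> List[Persona]:
    """Return every axis combination that passes the `is_allowed` filter."""
    axes = AXES if axes is None else axes
    names = list(axes)
    personas = []
    for values in product(*(axes[name] for name in names)):
        persona = Persona(**dict(zip(names, values)))
        if is_allowed(persona):
            personas.append(persona)
    return personas


if __name__ == "__main__":
    all_personas = enumerate_personas()
    print(len(all_personas), "combinations with the placeholder levels")
    print(all_personas[0].to_prompt())
```

With these placeholder levels the unrestricted product has 3 × 3 × 2 × 2 = 36 members. A restricted persona set can be expressed by passing a predicate, for example `enumerate_personas(is_allowed=lambda p: p.confusion_level == "normal")`.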