🤖 AI Summary
This study addresses the challenge of automatically extracting and modeling patients’ perceptions of physicians’ personality traits from online reviews to enhance trust and satisfaction in patient–physician relationships.
Method: Leveraging 4.1 million nationwide patient reviews, we developed the first national-scale, interpretable analytics pipeline driven by large language models (LLMs), integrating multi-model comparison, expert-annotated benchmarks, and clustering analysis to systematically identify physicians' Big Five personality traits.
Contribution/Results: We identified four stable physician archetypes whose prevalence varies significantly across specialties, along with gender disparities in perceived traits. LLM-derived trait scores show high agreement with human annotation (r = 0.72-0.89) and strong correlations with patient satisfaction (r = 0.41-0.81). This work establishes a novel, empirically grounded paradigm for healthcare quality assessment and interpersonal relationship optimization in clinical settings.
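The archetype step groups physicians by their Big Five trait-score profiles. The summary does not name the clustering algorithm, so as an illustrative assumption the sketch below uses plain k-means (stdlib only) on synthetic 5-dimensional trait vectors; the trait values, cluster count, and data are all hypothetical.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal Lloyd's k-means over tuples of equal length."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k),
                      key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[idx].append(p)
        # Recompute each center as the per-dimension mean of its cluster.
        centers = [tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical per-physician Big Five scores (O, C, E, A, N) on a 1-5 scale:
# one synthetic "uniformly high" group and one "uniformly low" group.
data_rng = random.Random(1)
high = [tuple(data_rng.uniform(4, 5) for _ in range(5)) for _ in range(50)]
low = [tuple(data_rng.uniform(1, 2) for _ in range(5)) for _ in range(50)]
centers, clusters = kmeans(high + low, k=2, seed=1)
```

With well-separated synthetic profiles the two recovered centers land near the high and low groups, mirroring how "Well-Rounded Excellent" and "Underperforming" archetypes would emerge from real trait scores.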
📝 Abstract
Understanding how patients perceive their physicians is essential to improving trust, communication, and satisfaction. We present a large language model (LLM)-based pipeline that infers Big Five personality traits and five patient-oriented subjective judgments. The analysis encompasses 4.1 million patient reviews of 226,999 U.S. physicians, drawn from an initial pool of one million. We validate the method through multi-model comparison and human expert benchmarking, achieving strong agreement between human and LLM assessments (r = 0.72-0.89) and demonstrating external validity through correlations with patient satisfaction (r = 0.41-0.81, all p<0.001). National-scale analysis reveals systematic patterns: male physicians receive higher ratings across all traits, with the largest disparities in clinical competence perceptions; empathy-related traits predominate in pediatrics and psychiatry; and all traits positively predict overall satisfaction. Cluster analysis identifies four distinct physician archetypes, from "Well-Rounded Excellent" (33.8%, uniformly high traits) to "Underperforming" (22.6%, consistently low). These findings demonstrate that automated trait extraction from patient narratives can provide interpretable, validated metrics for understanding physician-patient relationships at scale, with implications for quality measurement, bias detection, and workforce development in healthcare.
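The validation step reports Pearson-style correlation coefficients between LLM-derived trait scores and human annotations or satisfaction ratings. A minimal stdlib-only sketch of that computation, on synthetic data (the trait and satisfaction values below are invented for illustration; the real pipeline would use per-physician aggregates):

```python
import random
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

# Synthetic per-physician agreeableness scores (1-5 scale) and star ratings
# constructed so satisfaction tracks the trait plus noise.
rng = random.Random(0)
agreeableness = [rng.uniform(1, 5) for _ in range(200)]
satisfaction = [min(5.0, max(1.0, a + rng.gauss(0, 0.8))) for a in agreeableness]

r = pearson_r(agreeableness, satisfaction)
print(f"r = {r:.2f}")  # positive by construction, since satisfaction tracks the trait
```

In the actual study this would be run per trait (and against human-annotated subsets for the agreement check), yielding the reported r = 0.72-0.89 and r = 0.41-0.81 ranges.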