🤖 AI Summary
This study investigates the capability of large language models (LLMs) to identify 20 clinically relevant social signals (such as provider dominance and patient warmth) solely from transcribed clinical dialogue.
Method: We introduce the first LLM-based framework to concurrently track all 20 manually annotated social behaviors. The framework combines task-specific prompt engineering, comparative evaluation across multiple model families (GPT and Llama series), and clinical-context-aware prompt optimization, and is rigorously evaluated on highly imbalanced real-world clinical data.
Contribution/Results: LLMs can reliably infer nonverbal social behaviors from text alone; prompt design and adaptation to the healthcare domain significantly improve classification accuracy; and systematic analysis reveals consistent context-sensitivity patterns in model behavior. This work establishes a reproducible methodological foundation and empirical evidence for automated, fine-grained analysis of clinical communication.
📝 Abstract
Effective communication between providers and their patients influences health and care outcomes. The effectiveness of such conversations has been linked not only to the exchange of clinical information but also to a range of interpersonal behaviors, commonly referred to as social signals, which are often conveyed through nonverbal cues and shape the quality of the patient-provider relationship. Recent advances in large language models (LLMs) have demonstrated an increasing ability to infer emotional and social behaviors from textual information alone. As automation expands in clinical settings, for example in the transcription of patient-provider conversations, there is growing potential for LLMs to automatically analyze and extract social behaviors from these interactions. To explore the foundational capabilities of LLMs in tracking social signals in clinical dialogue, we designed task-specific prompts and evaluated model performance across multiple architectures and prompting styles on a highly imbalanced, annotated dataset spanning 20 distinct social signals, such as provider dominance and patient warmth. We present the first system capable of tracking all 20 of these coded signals and uncover patterns in LLM behavior. Further analysis of model configurations and clinical context provides insights for enhancing LLM performance on social signal processing tasks in healthcare settings.
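The task-specific prompting described above can be sketched minimally as a template that asks a model to rate one social signal in a transcript excerpt. This is an illustrative sketch only, not the authors' actual prompts: the signal names shown, the label scale (`high`/`low`/`absent`), and the template wording are all assumptions for illustration.

```python
# Illustrative sketch of task-specific prompt construction for social
# signal classification. Signal names, label scale, and wording are
# hypothetical; the paper's real prompts and 20-signal codebook differ.

SOCIAL_SIGNALS = ["provider dominance", "patient warmth"]  # 2 of the 20 coded signals

PROMPT_TEMPLATE = (
    "You are analyzing a transcribed patient-provider conversation.\n"
    "Signal to assess: {signal}\n"
    "Transcript excerpt:\n{excerpt}\n"
    "Answer with exactly one label: high, low, or absent."
)


def build_prompt(signal: str, excerpt: str) -> str:
    """Fill the template for one (signal, excerpt) pair."""
    if signal not in SOCIAL_SIGNALS:
        raise ValueError(f"unknown signal: {signal}")
    return PROMPT_TEMPLATE.format(signal=signal, excerpt=excerpt)


excerpt = "Doctor: We're switching your medication today.\nPatient: Oh... okay."
print(build_prompt("provider dominance", excerpt))
```

In a real pipeline, one such prompt would be issued per signal (or several signals batched into one call) and the model's label compared against the manual annotations; constraining the answer to a fixed label set keeps the output parseable for evaluation on imbalanced data.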