🤖 AI Summary
This study addresses the lack of objective, real-time assessment tools for team leadership in pediatric intensive care units (PICUs). We propose the first first-person vision–based multimodal automated leadership assessment framework. Using Aria Glasses, we simultaneously capture egocentric video, audio, eye-tracking, and head-motion data. These are processed via REMoDNaV for eye-movement decoding, SAM/YOLO for object segmentation, and ChatGPT-enhanced dialogue analysis to quantify behavioral metrics—including fixation targets, eye contact frequency, and verbal directive patterns. We further construct an interpretable behavior-to-leadership competency mapping model. Our approach uniquely integrates egocentric perception with large language model–assisted analysis to enable fine-grained, traceable leadership evaluation in high-stakes clinical simulations. In four simulated PICU scenarios, key metrics—including fixation duration, gaze transition patterns, and direct verbal orders—showed significant correlation with expert leadership ratings (p < 0.01), and our model improved prediction accuracy by 32% over baseline methods.
📝 Abstract
This paper addresses the task of assessing PICU team's leadership skills by developing an automated analysis framework based on egocentric vision. We identify key behavioral cues, including fixation object, eye contact, and conversation patterns, as essential indicators of leadership assessment. In order to capture these multimodal signals, we employ Aria Glasses to record egocentric video, audio, gaze, and head movement data. We collect one-hour videos of four simulated sessions involving doctors with different roles and levels. To automate data processing, we propose a method leveraging REMoDNaV, SAM, YOLO, and ChatGPT for fixation object detection, eye contact detection, and conversation classification. In the experiments, significant correlations are observed between leadership skills and behavioral metrics, i.e., the output of our proposed methods, such as fixation time, transition patterns, and direct orders in speech. These results indicate that our proposed data collection and analysis framework can effectively solve skill assessment for training PICU teams.