🤖 AI Summary
This study investigates how the head movements of virtual animated characters (avatars) affect listeners' communication behaviour, sense of presence, and conversation success in triadic virtual reality (VR) conversations. Triadic conversations were conducted between a test participant and two confederates in telepresence, using a system that transmitted audio, head-movement data, and video with low delay, which allowed the confederates' avatar head movements to be manipulated. Four conditions were compared in naturalistic three-party conversations: (1) static avatar heads, (2) automated head animations triggered by speech-level onsets, (3) avatar heads animated by the interlocutors' transmitted head movements, and (4) videos of the interlocutors' heads embedded in the visual scene. The results show significant effects of animation level on participants' speech and head-movement behaviour as recorded by physical sensors, as well as on subjective presence and conversation success, with the largest effects on the range of head orientation during speech and the perceived realism of the avatars. Notably, participants reported being spoken to in a more helpful way when avatars showed transmitted head movements than when the avatars' heads were static. These findings empirically support the conclusion that avatar representations must include sufficiently realistic head movements to elicit natural multiparty communication behaviour, with implications for increasing the ecological validity of hearing device evaluations in VR.
📝 Abstract
Interactive communication in virtual reality can be used in experimental paradigms to increase the ecological validity of hearing device evaluations. This requires the virtual environment to elicit natural communication behaviour in listeners. This study evaluates the effect of virtual animated characters' head movements on participants' communication behaviour and experience. Triadic conversations were conducted between a test participant and two confederates. To facilitate the manipulation of head movements, the conversation was conducted in telepresence using a system that transmitted audio, head movement data and video with low delay. The confederates were represented by virtual animated characters (avatars) with different levels of animation: Static heads, automated head movement animations based on speech level onsets, and animated head movements based on the transmitted head movements of the interlocutors. A condition was also included in which the videos of the interlocutors' heads were embedded in the visual scene. The results show significant effects of animation level on the participants' speech and head movement behaviour as recorded by physical sensors, as well as on the subjective sense of presence and the success of the conversation. The largest effects were found for the range of head orientation during speech and the perceived realism of avatars. Participants reported that they were spoken to in a more helpful way when the avatars showed head movements transmitted from the interlocutors than when the avatars' heads were static. We therefore conclude that the representation of interlocutors must include sufficiently realistic head movements in order to elicit natural communication behaviour.