🤖 AI Summary
To address insufficient binaural audio reproduction accuracy for arbitrary microphone arrays—particularly wearable ones—under head rotation, this paper proposes an array-aware binaural rendering method. The core innovation lies in embedding the geometric characteristics of the microphone array into the HRTF preprocessing pipeline, enabling joint optimization of Ambisonics encoding and HRTF filtering to co-model array-specific acoustic responses and head-related spatial cues. The method maintains full compatibility with standard Ambisonics formats and supports real-time rendering. Objective evaluations demonstrate significant improvements over conventional approaches in azimuth error and spectral distortion metrics. Subjective listening tests confirm superior perceptual quality in spatial impression, timbral naturalness, and robustness to head motion. These results validate its suitability for high-fidelity spatial audio applications in dynamic environments such as VR and AR.
📝 Abstract
This work introduces a novel method for binaural reproduction from arbitrary microphone arrays, based on array-aware optimization of Ambisonics encoding through Head-Related Transfer Function (HRTF) pre-processing. The proposed approach integrates array-specific information into the HRTF processing pipeline, leading to improved spatial accuracy in binaural rendering. Objective evaluations demonstrate superior performance for a simulated wearable array under head rotations, compared to the conventional Ambisonics encoding method. A listening experiment further confirms that the method achieves significantly higher perceptual ratings in both timbre and spatial quality. Fully compatible with standard Ambisonics, the proposed method offers a practical solution for spatial audio rendering in applications such as virtual reality, augmented reality, and wearable audio capture.
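To make the general idea concrete, the sketch below shows a generic regularized least-squares fit of microphone-to-binaural filters against target HRTFs at one frequency bin. This is an illustration of array-aware binaural filtering in the broad sense, not the paper's exact pipeline; the function name `ls_binaural_filters`, the synthetic steering matrix `V`, HRTF matrix `H`, and the regularization parameter `reg` are all assumptions made for this example.

```python
import numpy as np

def ls_binaural_filters(V, H, reg=1e-3):
    """Regularized least-squares mic-to-binaural filters (one frequency bin).

    V : (M, Q) complex array steering matrix (M mics, Q plane-wave directions)
    H : (2, Q) complex target HRTFs (left/right ears, same Q directions)
    Returns W : (2, M) filters chosen so that W @ V approximates H.
    """
    # Closed-form solution: W = H V^H (V V^H + reg I)^{-1}
    M = V.shape[0]
    G = V @ V.conj().T + reg * np.eye(M)  # regularized Gram matrix of the array
    return H @ V.conj().T @ np.linalg.inv(G)

# Toy example with synthetic data (illustration only; real use would employ
# measured array responses and an HRTF set, solved per frequency bin).
rng = np.random.default_rng(0)
M, Q = 6, 64  # hypothetical 6-mic wearable array, 64 directions
V = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))
H = rng.standard_normal((2, Q)) + 1j * rng.standard_normal((2, Q))
W = ls_binaural_filters(V, H)
err = np.linalg.norm(W @ V - H) / np.linalg.norm(H)  # relative fit error
print(W.shape, err)
```

In the paper's setting, the array information instead enters through HRTF pre-processing, so that a standard Ambisonics encoder applied to the array signals yields the optimized binaural output while remaining format-compatible.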