Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization

📅 2025-01-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets HRTF spatial upsampling when only a few directions (e.g., 3 or 5) have been measured for the target subject, a regime that remains challenging for neural-field methods. It proposes the retrieval-augmented neural field (RANF), which retrieves from a dataset one or more subjects whose HRTFs are close to those of the target subject, then feeds each retrieved subject's HRTF at the queried direction into the neural field together with the sound source direction. To handle multiple retrieved subjects efficiently, the network borrows the transform-average-concatenate scheme from multi-channel processing. Experiments on the SONICOM dataset confirm the benefit of retrieval, and RANF is a key component of the winning solution to Task 2 of the Listener Acoustic Personalization Challenge 2024.
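The retrieval step can be illustrated with a minimal sketch. The summary does not say how closeness between subjects is measured, so this example assumes a log-spectral distance averaged over the target's measured directions; the function names, the data layout, and the toy values are all hypothetical.

```python
import math

def log_spectral_distance(mag_a, mag_b):
    """RMS difference in dB between two magnitude spectra of equal length."""
    sq = [(20.0 * math.log10(a / b)) ** 2 for a, b in zip(mag_a, mag_b)]
    return math.sqrt(sum(sq) / len(sq))

def retrieve_subjects(target, database, k=1):
    """Return the IDs of the k database subjects closest to the target.

    target:   {direction: magnitude spectrum} for the few measured directions
    database: {subject_id: {direction: magnitude spectrum}} on a denser grid
    """
    scores = []
    for subject_id, hrtfs in database.items():
        # Compare only at the directions measured for the target subject.
        dists = [log_spectral_distance(target[d], hrtfs[d]) for d in target]
        scores.append((sum(dists) / len(dists), subject_id))
    scores.sort()
    return [subject_id for _, subject_id in scores[:k]]

# Toy data: directions are (azimuth, elevation) in degrees, 3 frequency bins.
# The metric above is an assumption, not the paper's stated criterion.
database = {
    "s1": {(0, 0): [1.0, 0.8, 0.5], (90, 0): [0.9, 0.7, 0.4], (180, 0): [0.6, 0.5, 0.3]},
    "s2": {(0, 0): [1.0, 0.4, 0.1], (90, 0): [0.5, 0.3, 0.1], (180, 0): [0.3, 0.2, 0.1]},
}
target = {(0, 0): [1.0, 0.75, 0.5], (90, 0): [0.9, 0.7, 0.45]}

nearest = retrieve_subjects(target, database, k=1)
print(nearest)
```

Because the retrieved subject has a dense grid, its HRTF at an unmeasured direction such as `(180, 0)` is available as an extra input when the target's HRTF at that direction is predicted.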

📝 Abstract

Head-related transfer functions (HRTFs) with dense spatial grids are desired for immersive binaural audio generation, but their recording is time-consuming. Although HRTF spatial upsampling has shown remarkable progress with neural fields, upsampling from only a few measured directions, e.g., 3 or 5 measurements, is still challenging. To tackle this problem, we propose a retrieval-augmented neural field (RANF). RANF retrieves a subject whose HRTFs are close to those of the target subject from a dataset. The HRTF of the retrieved subject at the desired direction is fed into the neural field in addition to the sound source direction itself. Furthermore, we present a neural network that can efficiently handle multiple retrieved subjects, inspired by a multi-channel processing technique called transform-average-concatenate. Our experiments confirm the benefits of RANF on the SONICOM dataset, and it is a key component in the winning solution of Task 2 of the Listener Acoustic Personalization Challenge 2024.
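The conditioning and fusion described in the abstract can be sketched as a single forward pass. This is not the authors' architecture: the layer sizes, the unit-vector direction encoding, the random weights standing in for trained parameters, and the final averaging over subjects are all assumptions made for illustration. It shows only the general transform-average-concatenate pattern, with retrieved subjects playing the role of channels.

```python
import numpy as np

rng = np.random.default_rng(0)
F, H = 8, 16  # frequency bins per HRTF and hidden width (both arbitrary)

def dense(n_in, n_out):
    """Randomly initialised weights standing in for trained parameters."""
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

def relu(x):
    return np.maximum(x, 0.0)

W_in, b_in = dense(3 + F, H)    # transform: shared by all retrieved subjects
W_avg, b_avg = dense(H, H)      # applied to the subject-averaged feature
W_cat, b_cat = dense(2 * H, H)  # applied after concatenation
W_out, b_out = dense(H, F)      # output head

def direction_vector(az_deg, el_deg):
    """Unit vector for a source direction given in degrees."""
    az, el = np.deg2rad(az_deg), np.deg2rad(el_deg)
    return np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])

def ranf_forward(az_deg, el_deg, retrieved):
    """retrieved: (K, F) HRTF magnitudes of K retrieved subjects at this direction."""
    K = retrieved.shape[0]
    d = np.tile(direction_vector(az_deg, el_deg), (K, 1))     # (K, 3)
    x = np.concatenate([d, retrieved], axis=1)                # (K, 3 + F)
    h = relu(x @ W_in + b_in)                                 # transform
    m = relu(h.mean(axis=0, keepdims=True) @ W_avg + b_avg)   # average
    h = np.concatenate([h, np.repeat(m, K, axis=0)], axis=1)  # concatenate
    h = relu(h @ W_cat + b_cat)
    # Assumption: per-subject outputs are averaged into one prediction.
    return (h @ W_out + b_out).mean(axis=0)                   # (F,)

retrieved = rng.uniform(0.1, 1.0, (3, F))
pred = ranf_forward(30.0, 0.0, retrieved)
print(pred.shape)
```

Since every subject passes through the same weights and information is shared only through a mean, the output does not depend on the order of the retrieved subjects, and the same parameters work for any number of them.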

Problem

Research questions and friction points this paper is trying to address.

HRTF spatial upsampling from only a few measured directions

Time-consuming recording of dense HRTF grids

Personalized binaural audio generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

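Retrieval-augmented neural field (RANF)

Conditioning on the retrieved subject's HRTF at the queried direction

Transform-average-concatenate handling of multiple retrieved subjects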
