Deep learning based spatial aliasing reduction in beamforming for audio capture

📅 2025-05-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Spatial aliasing in spaced microphone arrays causes directional ambiguity at high frequencies, degrading the spatial resolution and spectral fidelity of beamforming. To address this, the authors propose a U-Net-based, signal-adaptive method that learns signal-dependent, multichannel de-aliasing filters. Two filter types are considered: one models each channel's time-frequency characteristics independently, while the other explicitly models cross-channel dependencies. The approach is evaluated in two common spatial capture scenarios, stereo and first-order Ambisonics. Experiments show that it significantly outperforms conventional beamforming both objectively and perceptually, suppressing aliasing artifacts and improving spatial and spectral accuracy.

📝 Abstract
Spatial aliasing affects spaced microphone arrays, causing directional ambiguity above certain frequencies, degrading spatial and spectral accuracy of beamformers. Given the limitations of conventional signal processing and the scarcity of deep learning approaches to spatial aliasing mitigation, we propose a novel approach using a U-Net architecture to predict a signal-dependent de-aliasing filter, which reduces aliasing in conventional beamforming for spatial capture. Two types of multichannel filters are considered, one which treats the channels independently and a second one that models cross-channel dependencies. The proposed approach is evaluated in two common spatial capture scenarios: stereo and first-order Ambisonics. The results indicate a very significant improvement, both objective and perceptual, with respect to conventional beamforming. This work shows the potential of deep learning to reduce aliasing in beamforming, leading to improvements in multi-microphone setups.
Problem

Research questions and friction points this paper is trying to address.

Reducing spatial aliasing in beamforming for audio capture
Addressing directional ambiguity in spaced microphone arrays
Improving spatial and spectral accuracy with deep learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

U-Net predicts signal-dependent de-aliasing filter
Multichannel filters model cross-channel dependencies
Improves beamforming in stereo and Ambisonics
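The core idea above, a network predicting a signal-dependent filter applied to the beamformer output in the time-frequency domain, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the U-Net is replaced by a deterministic placeholder (`fake_unet_gains`), and the filter is a real-valued gain rather than whatever filter parametrization the paper uses.

```python
import numpy as np

def fake_unet_gains(stft_mag: np.ndarray) -> np.ndarray:
    """Placeholder for the paper's U-Net: attenuates bins whose magnitude
    is far above the per-frame median, a crude stand-in for a learned,
    signal-dependent aliasing-suppression filter."""
    median = np.median(stft_mag, axis=1, keepdims=True)  # per channel/frame
    return np.clip(median / (stft_mag + 1e-12), 0.0, 1.0)

def apply_dealiasing(stft: np.ndarray) -> np.ndarray:
    """Apply signal-dependent gains to a (channels, freq, time) STFT of
    the beamformer output; each channel is filtered independently here,
    i.e. the 'independent channels' variant of the two filter types."""
    gains = fake_unet_gains(np.abs(stft))
    return gains * stft

rng = np.random.default_rng(0)
stft = rng.standard_normal((2, 257, 64)) + 1j * rng.standard_normal((2, 257, 64))
out = apply_dealiasing(stft)  # same shape, with high-energy bins attenuated
assert out.shape == stft.shape
```

The cross-channel variant described in the abstract would instead let the predicted filter for one channel depend on all input channels, which is where inter-channel spatial correlations enter.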
Mateusz Guzik
Institute of Electronics, AGH University of Krakow, Poland
Giulio Cengarle
Dolby Laboratories, Barcelona, Spain
Daniel Arteaga
Dolby Labs & Universitat Pompeu Fabra
Spatial audio · machine learning · acoustics · theoretical physics · gravitation