Audio Spotforming via Post-Filtering Using Cross-Array Non-target Estimates

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Conventional audio spotforming methods build their post-filter from a low-rank approximation of the target speech. Low-rank models fit the complex structure of speech signals poorly, which limits target speech extraction performance. This work instead builds the post-filter from cross-array estimates of the non-target components. It relies on the observation that interference lying in the target direction as seen from one microphone array can be spatially separated when viewed from the other arrays. Estimating those in-direction non-target components from the other arrays yields a post-filter that does not depend on a low-rank assumption, which better matches the characteristics of speech while preserving the spatial filtering properties. Experiments show that the proposed method outperforms conventional spotforming methods in speech extraction performance.

📝 Abstract

Audio spotforming is a technique for extracting target speech from noisy mixtures by utilizing multiple microphone arrays. Conventional methods estimate a shared target speech component from the linearly separated signals obtained by each array using low-rank approximations, and apply post-filtering (PF) based on this estimated low-rank representation. However, owing to the mismatch between low-rank models and the complex structure of speech signals, relying directly on low-rank approximations for PF can degrade speech extraction performance. In this study, we leverage the observation that non-target components located in the target speech direction from the perspective of one array can be spatially separated when viewed from other arrays. This insight motivates a new spotforming method that estimates the post-filter efficiently from non-target estimates across arrays instead of relying on low-rank approximations. Experiments demonstrate that the proposed method outperforms conventional spotforming methods.
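To make the idea of a post-filter driven by non-target estimates more concrete, the following is a minimal sketch, not the paper's algorithm. It assumes that one array's beamformer output toward the target and an estimate of the non-target component in that same direction (obtained, in some unspecified way, from the other arrays) are both available as complex STFT matrices. The gain rule is a generic power-subtraction (Wiener-like) mask swapped in for illustration; the function name `nontarget_post_filter` and the parameters `floor` and `eps` are hypothetical.

```python
import numpy as np


def nontarget_post_filter(beam_out, nontarget_est, floor=0.05, eps=1e-12):
    """Apply a Wiener-like gain built from a non-target estimate.

    beam_out      : complex STFT (freq x time) of one array's output
                    steered toward the target (assumed input).
    nontarget_est : complex STFT of the non-target component in that
                    direction, assumed to come from the other arrays.
    floor         : minimum gain, limits musical-noise artifacts.
    """
    mix_pow = np.abs(beam_out) ** 2
    non_pow = np.abs(nontarget_est) ** 2
    # Power subtraction: remaining power is attributed to the target.
    gain = np.maximum(mix_pow - non_pow, 0.0) / (mix_pow + eps)
    gain = np.clip(gain, floor, 1.0)
    return gain * beam_out, gain


# Toy usage with synthetic spectra (random data, illustration only).
rng = np.random.default_rng(0)
shape = (257, 100)  # frequency bins x time frames
target = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
interf = 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
enhanced, gain = nontarget_post_filter(target + interf, interf)
```

The mask depends only on the power of the non-target estimate relative to the beamformer output, so no low-rank model of the target speech is involved. How the non-target estimate is actually obtained across arrays is the substance of the paper and is not reproduced here.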

Problem

Research questions and friction points this paper is trying to address.

audio spotforming

post-filtering

low-rank approximation

speech extraction

multi-array processing

Innovation

Methods, ideas, or system contributions that make the work stand out.

audio spotforming

post-filtering

cross-array