Knowing When to Quit: Probabilistic Early Exits for Speech Separation

📅 2025-07-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Single-channel speech separation models for resource-constrained embedded devices lack adaptability to varying computational budgets. Method: This paper proposes an uncertainty-aware probabilistic early-exit mechanism that jointly models clean speech and residual variance, deriving an interpretable, fine-grained probabilistic exit criterion optimized for signal-to-noise ratio (SNR). The approach tightly integrates deep neural networks with a principled probabilistic framework, enabling multi-budget deployment within a single model via dynamic inference cost scaling. Contribution/Results: Experiments demonstrate that the model achieves or approaches state-of-the-art (SOTA) performance across diverse computational constraints, significantly improving the efficiency–accuracy trade-off in both speech separation and enhancement tasks. It provides a practical, adaptive solution deployable on heterogeneous edge devices.

📝 Abstract
In recent years, deep learning-based single-channel speech separation has improved considerably, in large part driven by increasingly compute- and parameter-efficient neural network architectures. Most such architectures are, however, designed with a fixed compute and parameter budget, and consequently cannot scale to varying compute demands or resources, which limits their use in embedded and heterogeneous devices such as mobile phones and hearables. To enable such use-cases we design a neural network architecture for speech separation capable of early-exit, and we propose an uncertainty-aware probabilistic framework to jointly model the clean speech signal and error variance which we use to derive probabilistic early-exit conditions in terms of desired signal-to-noise ratios. We evaluate our methods on both speech separation and enhancement tasks, and we show that a single early-exit model can be competitive with state-of-the-art models trained at many compute and parameter budgets. Our framework enables fine-grained dynamic compute-scaling of speech separation networks while achieving state-of-the-art performance and interpretable exit conditions.
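The abstract describes exiting a layered separator early once a predicted signal-to-noise ratio meets a target. The following is a minimal illustrative sketch of that idea, not the paper's actual architecture: the block interface, the zero-mean Gaussian residual assumption, and all function names (`estimated_snr_db`, `separate_with_early_exit`) are hypothetical.

```python
import numpy as np

def estimated_snr_db(s_hat, log_var):
    """SNR estimate from a predicted clean signal and residual variance.

    Illustrative assumption: the residual is zero-mean Gaussian with the
    predicted (log-)variance, so SNR = signal power / expected noise power.
    """
    signal_power = np.mean(s_hat ** 2)
    noise_power = np.mean(np.exp(log_var))
    return 10.0 * np.log10(signal_power / noise_power)

def separate_with_early_exit(blocks, x, target_snr_db=15.0):
    """Run separator blocks sequentially, exiting as soon as the
    probabilistic SNR estimate reaches the requested quality target.

    Each block is assumed to return (features, clean-speech estimate,
    residual log-variance) -- a hypothetical exit-head interface.
    """
    h = x
    for depth, block in enumerate(blocks, start=1):
        h, s_hat, log_var = block(h)
        if estimated_snr_db(s_hat, log_var) >= target_snr_db:
            return s_hat, depth   # confident enough: stop computing
    return s_hat, len(blocks)     # budget exhausted: full-depth output
```

Raising `target_snr_db` makes the model run deeper (more compute, higher quality); lowering it exits earlier, which is how one model can serve many compute budgets.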
Problem

Research questions and friction points this paper is trying to address.

Enabling dynamic compute-scaling for speech separation networks
Designing uncertainty-aware early-exit conditions for variable resources
Achieving state-of-the-art performance with interpretable exit strategies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Early-exit neural network for speech separation
Uncertainty-aware probabilistic exit framework
Dynamic compute-scaling with interpretable conditions
Kenny Falkær Olsen
Technical University of Denmark
Mads Østergaard
WS Audiology
Karl Ulbæk
WS Audiology
Søren Føns Nielsen
WS Audiology
Rasmus Malik Høegh Lindrup
WS Audiology
Bjørn Sand Jensen
Associate Professor
Machine Learning · Signal (image and audio) processing/analysis
Morten Mørup
Section for Cognitive Systems, Technical University of Denmark
Machine Learning · Neuroimaging · Complex Networks · Bayesian Modeling