A Multi-stage Low-latency Enhancement System for Hearing Aids

📅 2025-08-06
🤖 AI Summary
Addressing the challenge of achieving both low latency (5 ms or less) and high speech intelligibility in hearing aids, this paper proposes an end-to-end, multi-stage speech enhancement system operating in the magnitude and complex spectral domains. Methodologically: (i) an asymmetric analysis/synthesis window pair raises frequency resolution while meeting the latency constraint; (ii) speech magnitude and phase are modeled jointly in the complex spectral domain, with head-rotation information combined with the mixture signals to aid enhancement; and (iii) a post-processing module is designed to raise HASPI scores when followed by the hearing aid amplification stage of the baseline system. The main contribution is the combination, within a low-latency architecture, of head-rotation signals, complex-domain modeling, and HASPI-oriented post-processing. The system was developed for the ICASSP 2023 Clarity Challenge, where the authors report higher HASPI scores.

📝 Abstract
This paper proposes an end-to-end system for the ICASSP 2023 Clarity Challenge. In this work, we introduce four major novelties: (1) a multi-stage system in both the magnitude and complex domains to better utilize phase information; (2) an asymmetric window pair to achieve higher frequency resolution under the 5 ms latency constraint; (3) the integration of head rotation information with the mixture signals to achieve better enhancement; (4) a post-processing module that achieves higher hearing aid speech perception index (HASPI) scores with the hearing aid amplification stage provided by the baseline system.
Problem

Research questions and friction points this paper is trying to address.

Enhance hearing aid performance with multi-stage processing
Achieve low-latency audio processing within the 5 ms constraint
Improve speech perception using head rotation data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-stage system in magnitude and complex domains
Asymmetric window pair for higher frequency resolution under the 5 ms latency constraint
Integration of head rotation and mixture signals
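
The asymmetric window pair idea can be illustrated with a minimal sketch. This is not the authors' design: the 44.1 kHz sample rate, the 1024-sample frame, the window shapes, and the function names `asymmetric_window_pair` and `overlap_add_roundtrip` are all assumptions made for illustration. The sketch pairs a long analysis window (for frequency resolution) with a short synthesis window whose support fits inside the latency budget, and chooses the two so that their product overlap-adds to one.

```python
import numpy as np

def asymmetric_window_pair(frame_len, synth_len):
    """Long analysis window plus short synthesis window (illustrative shapes).

    The product analysis * synthesis equals a periodic Hann window of length
    synth_len placed at the end of the frame, which sums to 1 at 50% overlap.
    """
    assert synth_len % 2 == 0 and frame_len >= synth_len
    hop = synth_len // 2
    rise_len = frame_len - hop
    # Slow sine rise and fast cosine fall; half-sample offsets keep both > 0.
    rise = np.sin(np.pi * (np.arange(rise_len) + 0.5) / (2 * rise_len))
    fall = np.cos(np.pi * (np.arange(hop) + 0.5) / (2 * hop))
    analysis = np.concatenate([rise, fall])
    # Target product window: periodic Hann over the last synth_len samples.
    target = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(synth_len) / synth_len)
    synthesis = np.zeros(frame_len)
    synthesis[-synth_len:] = target / analysis[-synth_len:]
    return analysis, synthesis, hop

def overlap_add_roundtrip(x, analysis, synthesis, hop):
    """Analyze, resynthesize, and overlap-add with no spectral modification."""
    frame_len = len(analysis)
    y = np.zeros(len(x))
    for start in range(0, len(x) - frame_len + 1, hop):
        spec = np.fft.rfft(x[start:start + frame_len] * analysis)
        # An enhancement model would modify `spec` here.
        frame = np.fft.irfft(spec, n=frame_len)
        y[start:start + frame_len] += frame * synthesis
    return y

# Assumed settings, not taken from the paper.
sample_rate = 44100                      # Hz (assumption)
latency_budget_s = 0.005                 # 5 ms constraint from the abstract
synth_len = int(sample_rate * latency_budget_s)
synth_len -= synth_len % 2               # keep it even for 50% overlap
frame_len = 1024                         # assumed analysis frame length

analysis, synthesis, hop = asymmetric_window_pair(frame_len, synth_len)
signal = np.random.default_rng(0).standard_normal(4 * frame_len)
rebuilt = overlap_add_roundtrip(signal, analysis, synthesis, hop)
# Compare away from the edges, where every sample is covered by two frames.
error = np.max(np.abs(rebuilt[frame_len:-frame_len] - signal[frame_len:-frame_len]))
```

Under these assumptions the synthesis window spans 220 samples, about 4.99 ms at 44.1 kHz, while each spectrum is computed from 1024 samples. In an overlap-add scheme of this kind the algorithmic delay is governed roughly by the synthesis window length rather than the frame length, which is why the longer analysis window does not add latency.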
Chengwei Ouyang
Orka Inc.
Kexin Fei
Orka Inc.
Haoshuai Zhou
Orka Inc.
Congxi Lu
Orka Inc.
Linkai Li
Head of Engineering, Orka Inc.
Signal Processing, Speech Enhancement, Biomedical Optics