🤖 AI Summary
Prior work predominantly investigates the impact of frequency bands on speech intelligibility, yet a systematic analysis of band-level noise robustness with respect to *perceived* speech quality remains lacking. This paper bridges that gap with a perceptually grounded evaluation framework: leveraging a MUSHRA-inspired subjective assessment paradigm, 32-channel constant-Q transform (CQT) filtering, real-world noise types, and multiple signal-to-noise ratios (SNRs), we quantify the noise robustness of individual frequency bands, formalized as the “Bandwise Noise Robustness Index” (BNRI). Our experiments reveal that the mid-frequency band (≈500–2000 Hz) exhibits the most pronounced degradation in perceived quality under noise, indicating that it is the least robust. This finding identifies a critical sensitivity region for perceptual speech quality optimization and provides an interpretable, perception-aligned foundation for band-weighted enhancement algorithms in quality-aware speech processing.
📝 Abstract
Speech quality is a central focus of speech-related research, where it is frequently studied alongside speech intelligibility, another essential measure. At the band level, however, perceptual speech intelligibility has been studied extensively, whereas perceptual speech quality has not been thoroughly analyzed. In this paper, a Multiple Stimuli with Hidden Reference and Anchor (MUSHRA)-inspired approach is proposed to study the robustness of individual frequency bands to noise, with perceptual speech quality as the measure. Speech signals were filtered into thirty-two frequency bands, and real-world noise was added at different signal-to-noise ratios. Noise-robustness indices for the individual frequency bands were then calculated from the human-rated perceptual quality scores assigned to the reconstructed noisy speech signals. Trends in the results suggest that the mid-frequency region is less robust to noise in terms of perceptual speech quality. These findings suggest that future research aiming to improve speech quality should pay particular attention to the mid-frequency region of speech signals.
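The pipeline described above can be sketched in code. The snippet below is a minimal illustration, not the authors' implementation: `mix_at_snr` scales a noise signal so the mixture reaches a target SNR (a standard construction), and `bandwise_robustness` is a hypothetical per-band robustness index, assumed here to be the fraction of perceptual quality retained under noise; the paper does not specify the exact formula for its index.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + scaled noise has the target SNR, then mix."""
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise

def bandwise_robustness(clean_scores, noisy_scores):
    """Hypothetical robustness index per band: the fraction of the band's
    perceptual quality score retained under noise. Values near 1 mean the
    band's perceived quality is barely affected; lower means less robust."""
    clean_scores = np.asarray(clean_scores, dtype=float)
    noisy_scores = np.asarray(noisy_scores, dtype=float)
    return noisy_scores / clean_scores

# Example: three bands rated on a 0-100 MUSHRA-style scale, clean vs. noisy.
indices = bandwise_robustness([90, 88, 85], [81, 44, 76.5])
least_robust_band = int(indices.argmin())  # here, the middle band
```

A lower index for the mid-frequency bands would reproduce the paper's qualitative finding; the specific scores above are illustrative only.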