🤖 AI Summary
This work addresses limited feature generalizability in respiratory sound classification caused by variations in recording quality and class imbalance. The authors propose QLung, a framework that introduces a no-reference audio quality metric based on spectral entropy and root-mean-square energy. This metric dynamically adjusts the angular margin in a normalized angular classifier to enhance intra-class compactness and inter-class separability, while a log-scaling strategy stabilizes training under severe class imbalance. Evaluated on the ICBHI dataset, QLung achieves a 2.46% improvement over the cross-entropy baseline, and it attains the strongest out-of-distribution performance on the SPRSound dataset among the state-of-the-art methods compared.
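The two signal statistics named in the summary can be illustrated with a minimal, standard-library sketch. The functions `rms_energy` and `spectral_entropy` follow the textbook definitions; `quality_score`, its `rms_ref` parameter, and the way it combines the two statistics are hypothetical choices made only for illustration and are not taken from the QLung paper or its code.

```python
import cmath
import math


def rms_energy(signal):
    """Root-mean-square energy of a non-empty sequence of samples."""
    if not signal:
        raise ValueError("signal must be non-empty")
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def spectral_entropy(signal):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    n = len(signal)
    if n < 2:
        return 0.0
    half = n // 2 + 1  # non-negative frequency bins of a real signal
    power = []
    for k in range(half):
        # Naive O(n^2) DFT for clarity; a real pipeline would use an FFT.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0.0:
        return 0.0
    probs = [p / total for p in power]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(half)


def quality_score(signal, rms_ref=0.1):
    """Hypothetical quality proxy in [0, 1] (NOT the paper's formula).

    Assumption: a flatter, noise-like spectrum and very low energy
    both indicate a poorer recording.
    """
    flatness_penalty = 1.0 - spectral_entropy(signal)
    loudness = min(rms_energy(signal) / rms_ref, 1.0)
    return min(max(flatness_penalty * loudness, 0.0), 1.0)
```

Under these assumptions, a pure tone yields a spectral entropy near 0 and a high score, while a single-sample impulse has a flat spectrum, an entropy near 1, and a score near 0.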
📝 Abstract
We present a quality-adaptive angular-margin learning framework that improves feature generalization by enforcing intra-class compactness and inter-class separability. Our framework, titled QLung, introduces a no-reference audio quality margin derived from spectral entropy and root-mean-square energy, which adaptively scales angular margins based on recording quality. We further propose a log-scaled angular margin that stabilizes training under severe class imbalance. We also use an angular classifier that normalizes features and class weights, ensuring that margin penalties are applied consistently on the unit hypersphere. Our approach improves in-distribution performance on the ICBHI dataset by 2.46% over the cross-entropy baseline and, most significantly, achieves the strongest out-of-distribution performance on the SPRSound dataset compared to prior state-of-the-art methods. Code is available at https://github.com/RSC-Toolkit/QLung.
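To make the idea of an angular classifier with a quality-dependent margin concrete, here is a minimal sketch in the style of additive angular-margin losses. It is not the authors' implementation: the function names, the `scale` and `base_margin` values, the assumption that the quality score lies in [0, 1], the choice that higher quality receives a larger margin, and the specific `log1p` scaling are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Project a vector onto the unit hypersphere (unchanged if all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0.0 else list(vec)


def log_scaled_margin(quality, base_margin=0.5):
    """Hypothetical log scaling: maps quality in [0, 1] to [0, base_margin]."""
    q = min(max(quality, 0.0), 1.0)
    return base_margin * math.log1p(q) / math.log(2.0)


def angular_margin_logits(feature, class_weights, target, quality,
                          scale=30.0, base_margin=0.5):
    """Scaled cosine logits with an angular margin added to the target class.

    Features and class weights are both L2-normalized, so every logit
    depends only on the angle between the feature and a class direction.
    """
    f = l2_normalize(feature)
    margin = log_scaled_margin(quality, base_margin)
    logits = []
    for idx, weight in enumerate(class_weights):
        w = l2_normalize(weight)
        cos_theta = sum(a * b for a, b in zip(f, w))
        cos_theta = min(max(cos_theta, -1.0), 1.0)  # guard acos domain
        if idx == target:
            # Penalize the target angle; cap at pi to keep cosine monotonic.
            theta = math.acos(cos_theta)
            cos_theta = math.cos(min(theta + margin, math.pi))
        logits.append(scale * cos_theta)
    return logits
```

The resulting logits would be passed to a standard softmax cross-entropy loss. In this sketch, a larger margin lowers the target-class logit, so the model must pull the feature closer to its class direction to achieve the same loss.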