🤖 AI Summary
To address domain shift induced by background noise in audio classification, this paper applies test-time adaptation (TTA), which adapts a pre-trained model during inference using only unlabelled test samples. It evaluates two common TTA methods, test-time training (TTT) and test-time entropy minimization (TENT), alongside the state-of-the-art method CoNMix, and proposes a modified version of CoNMix. Evaluated on AudioMNIST and SpeechCommands V1 under different background noise types and noise severity levels, the modified CoNMix achieves the highest classification accuracy of the three methods, with error rates on AudioMNIST of 5.31% under 10 dB exercise bike noise and 12.75% under 3 dB running tap noise. The authors report finding no similar prior work and present this as the first study to leverage TTA for audio classification under domain shift.
📝 Abstract
Domain shift is a prominent problem in deep learning: a model pre-trained on a source dataset can suffer significant performance degradation when the test data differs from that source. This research addresses audio classification under domain shift caused by background noise using Test-Time Adaptation (TTA), a technique that adapts a pre-trained model during testing using only unlabelled test data before making predictions. We adopt two common TTA methods, TTT and TENT, and a state-of-the-art method, CoNMix, and investigate their performance on two popular audio classification datasets, AudioMNIST (AM) and SpeechCommands V1 (SC), against different types of background noise and noise severity levels. The experimental results reveal that our proposed modified version of CoNMix produced the highest classification accuracy under domain shift compared to TTT and TENT (5.31% error rate under 10 dB exercise bike background noise and 12.75% error rate under 3 dB running tap background noise for AM). Our literature search found no evidence of similar work, so, to our knowledge, this is the first study to leverage TTA techniques for audio classification under domain shift.
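The abstract quotes noise severity in decibels (10 dB, 3 dB) without defining the measure. Assuming these figures are signal-to-noise ratios, the sketch below shows one common way a noisy test clip could be built: scale the background noise so that the clean-to-noise power ratio matches the target SNR, then add it to the clean signal. This illustrates the standard SNR formula and is not the authors' data pipeline. The function names `mix_at_snr` and `measured_snr_db` are hypothetical, and a synthetic sine tone and Gaussian noise stand in for real speech and background recordings.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the result has the requested SNR in dB.

    Hypothetical helper: uses SNR_dB = 10 * log10(P_clean / P_noise).
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise must not be silent")
    # Noise power required to reach the target SNR.
    target_noise_power = power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """Recover the SNR of a mixture, given the clean reference signal."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


# Synthetic stand-ins (assumption): a 440 Hz tone sampled at 16 kHz as the
# "speech" signal, and Gaussian noise as the "background" signal.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy_10db = mix_at_snr(clean, noise, 10.0)  # milder corruption
noisy_3db = mix_at_snr(clean, noise, 3.0)    # heavier corruption
```

Under this reading, a lower dB value means proportionally more noise power, which would be consistent with the higher error rate reported at 3 dB than at 10 dB, although the two figures also involve different noise types.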