DHAuDS: A Dynamic and Heterogeneous Audio Benchmark for Test-Time Adaptation

📅 2025-11-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Audio classifiers lose accuracy under domain shift caused by changes in the acoustic environment, yet existing test-time adaptation (TTA) studies mostly evaluate under fixed or mismatched noise conditions that do not capture the diversity of real-world degradations. To address this, we propose DHAuDS, a dynamic and heterogeneous audio degradation benchmark designed for evaluating audio TTA. DHAuDS comprises four standardized benchmarks (UrbanSound8K-C, SpeechCommandsV2-C, VocalSound-C, and ReefSet-C) whose degraded samples combine dynamically varying corruption severity with multiple superimposed noise types. It defines 14 evaluation criteria per benchmark (8 for UrbanSound8K-C), for 50 distinct criteria in total (3 × 14 + 8), covering 124 experiments and including dynamic, mixed-domain noise configurations. DHAuDS thereby supports fair, reproducible, cross-domain comparison of TTA methods under diverse, realistic audio degradations.

📝 Abstract

Audio classifiers frequently face domain shift, in which models trained on one dataset lose accuracy on data recorded in acoustically different conditions. Previous Test-Time Adaptation (TTA) research in speech and sound analysis often evaluates models under fixed or mismatched noise settings that fail to mimic real-world variability. To overcome these limitations, this paper presents DHAuDS (Dynamic and Heterogeneous Audio Domain Shift), a benchmark designed to assess TTA approaches under more realistic and diverse acoustic shifts. DHAuDS comprises four standardized benchmarks: UrbanSound8K-C, SpeechCommandsV2-C, VocalSound-C, and ReefSet-C, each constructed with dynamic corruption severity levels and heterogeneous noise types to simulate authentic audio degradation scenarios. The framework defines 14 evaluation criteria for each benchmark (8 for UrbanSound8K-C), resulting in 50 distinct criteria (124 experiments) that collectively enable fair, reproducible, and cross-domain comparison of TTA algorithms. Through the inclusion of dynamic and mixed-domain noise settings, DHAuDS offers a consistent and publicly reproducible testbed to support ongoing studies in robust and adaptive audio modeling.
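To make "dynamic corruption severity" and "heterogeneous noise types" concrete, the following is a minimal Python sketch of one way such a corruption could be generated: the clip is split into segments, and each segment receives a randomly chosen noise type mixed at a randomly chosen signal-to-noise ratio (SNR). This is an illustration under stated assumptions, not the DHAuDS pipeline: the abstract does not specify how the benchmark builds its corruptions, and the function names, the segment-wise schedule, the SNR range, and the synthetic noise bank are all hypothetical.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mix has the requested SNR in dB."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    if clean_power == 0.0 or noise_power == 0.0:
        return clean.copy()  # nothing meaningful to scale against
    # Solve snr_db = 10 * log10(clean_power / (scale**2 * noise_power)) for scale.
    scale = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise


def dynamic_heterogeneous_corrupt(clean, noise_bank, snr_range_db, segment_len, rng):
    """Hypothetical helper: corrupt `clean` segment by segment.

    Each segment draws its own noise type (heterogeneous) and its own SNR
    (dynamic severity). Returns the corrupted signal and the schedule used,
    as a list of (start_sample, noise_name, snr_db) tuples.
    """
    names = sorted(noise_bank)
    out = np.empty_like(clean, dtype=np.float64)
    schedule = []
    for start in range(0, len(clean), segment_len):
        seg = clean[start:start + segment_len]
        name = names[rng.integers(len(names))]
        snr_db = float(rng.uniform(*snr_range_db))
        noise = noise_bank[name]
        # Tile or crop the noise recording to the segment length.
        reps = int(np.ceil(len(seg) / len(noise)))
        noise_seg = np.tile(noise, reps)[:len(seg)]
        out[start:start + len(seg)] = mix_at_snr(seg, noise_seg, snr_db)
        schedule.append((start, name, snr_db))
    return out, schedule


# Usage with synthetic signals (assumed stand-ins for real recordings).
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
clean = np.sin(2 * np.pi * 440 * t)              # 1 s test tone
noise_bank = {
    "white": rng.standard_normal(sr),            # broadband noise
    "hum": np.sin(2 * np.pi * 50 * t),           # mains-like hum
}
corrupted, schedule = dynamic_heterogeneous_corrupt(
    clean, noise_bank, snr_range_db=(0.0, 20.0), segment_len=4000, rng=rng
)
```

Returning the schedule alongside the audio keeps the corruption reproducible and lets an evaluation report accuracy per noise type or per severity band. A fixed-severity, single-noise baseline is the special case of one segment, one entry in `noise_bank`, and an SNR range of zero width.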

Problem

Research questions and friction points this paper is trying to address.

Addresses audio classifier accuracy loss from domain shifts

Evaluates test-time adaptation under realistic acoustic variations

Provides standardized benchmarks for reproducible cross-domain comparisons

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic heterogeneous audio benchmark for domain shift

Standardized benchmarks with dynamic corruption severity

Four benchmarks with 50 criteria for TTA evaluation

🔎 Similar Papers

Personalized Speech Recognition for Children with Test-Time Adaptation

2024-09-19 · arXiv.org · Citations: 0