Towards Event-Robust Acoustic Scene Classification

2026-06-05
Citations: 0
Influential: 0
AI Summary
Current acoustic scene classification (ASC) systems are not robust to unknown sound events encountered in real-world environments. To address this limitation, this work introduces the Event-Shifted Acoustic Scene (ESAS) dataset, described as the first ASC evaluation benchmark designed specifically for event-level robustness. ESAS is constructed by using large language models to guide the injection of foreground sound events into background scenes, thereby simulating realistic acoustic perturbations. The work covers the construction methodology, dataset statistics, and evaluation protocol. Experimental results show a significant performance drop for existing ASC models on ESAS, revealing their vulnerability to unseen sound events and providing a standardized testbed for research on robust acoustic scene classification.
πŸ“ Abstract
This paper introduces the Event-Shifted Acoustic Scene (ESAS) dataset, a novel benchmark for evaluating the robustness of Acoustic Scene Classification (ASC) systems against unknown sound events. Existing ASC datasets typically contain recordings of clean and consistent audio, while real-world environments often include diverse and unexpected sound events. To bridge this gap, ESAS simulates real-world acoustic variability by injecting foreground sound events into background scenes with the assistance of large language models. In this work, we present the construction methodology, dataset statistics, and evaluation protocols. Furthermore, a comprehensive evaluation of state-of-the-art ASC systems is conducted using the ESAS benchmark. Experimental results reveal that existing ASC models suffer significant performance degradation when facing the event-shift challenge. The introduction of the ESAS dataset aims to drive future research toward event-robust ASC.
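The abstract does not spell out how foreground events are mixed into background scenes, so the following is only a minimal sketch of one common way to do it: overlaying an event clip on a scene recording at a chosen event-to-background power ratio. The function name `inject_event`, the `snr_db` and `offset` parameters, and the power-ratio scaling rule are all assumptions for illustration, not the authors' pipeline. The large-language-model step that selects which event to inject into which scene is not modeled here.

```python
import numpy as np


def inject_event(background, event, snr_db, offset):
    """Overlay `event` on `background` starting at sample `offset`.

    Hypothetical helper (not from the paper). The event is scaled so that
    its power relative to the overlapped background segment equals
    `snr_db` decibels. Returns the mixed signal and the applied gain.
    """
    background = np.asarray(background, dtype=np.float64)
    event = np.asarray(event, dtype=np.float64)
    if not 0 <= offset < len(background):
        raise ValueError("offset must fall inside the background clip")

    # Truncate the event if it would run past the end of the scene.
    end = min(len(background), offset + len(event))
    event = event[: end - offset]
    segment = background[offset:end]

    bg_power = np.mean(segment ** 2)
    ev_power = np.mean(event ** 2)
    if bg_power == 0.0 or ev_power == 0.0:
        raise ValueError("cannot set a power ratio against a silent clip")

    # Solve 10*log10(gain^2 * ev_power / bg_power) = snr_db for gain.
    gain = np.sqrt(bg_power / ev_power * 10.0 ** (snr_db / 10.0))

    mixed = background.copy()
    mixed[offset:end] += gain * event
    return mixed, gain
```

Under these assumptions, lowering `snr_db` makes the injected event quieter relative to the scene, which gives a simple knob for controlling how severe the simulated event shift is.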
Problem

Research questions and friction points this paper is trying to address.

Acoustic Scene Classification
robustness
unknown sound events
event-shift
real-world acoustic variability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Event-Shifted Acoustic Scene
Acoustic Scene Classification
Robustness
Sound Event Injection
Large Language Models