Synthetic data enables context-aware bioacoustic sound event detection

πŸ“… 2025-03-01
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address insufficient few-shot sound event detection capabilities in bioacoustics, this paper proposes a synthetic-data-driven query-based Transformer framework. We generate 8.8 thousand hours of strongly labeled audio via domain randomization, establishing the first publicly available few-shot bioacoustic benchmark covering 13 diverse tasks. Furthermore, we design a context-aware, training-free few-shot inference mechanism. Our work pioneers the use of synthetic data for pretraining foundation models in bioacoustics, significantly enhancing generalization to novel species and unseen environments. On few-shot detection benchmarks, our method achieves an average 49% improvement over state-of-the-art approaches. The model is deployed as an open API, enabling plug-and-play adoption by ecologists and behavioral scientists.

Technology Category

Application Category

πŸ“ Abstract
We propose a methodology for training foundation models that enhances their in-context learning capabilities within the domain of bioacoustic signal processing. We use synthetically generated training data, introducing a domain-randomization-based pipeline that constructs diverse acoustic scenes with temporally strong labels. We generate over 8.8 thousand hours of strongly-labeled audio and train a query-by-example, transformer-based model to perform few-shot bioacoustic sound event detection. Our second contribution is a public benchmark of 13 diverse few-shot bioacoustics tasks. Our model outperforms previously published methods by 49%, and we demonstrate that this is due to both model design and data scale. We make our trained model available via an API, to provide ecologists and ethologists with a training-free tool for bioacoustic sound event detection.
Problem

Research questions and friction points this paper is trying to address.

Enhance in-context learning for bioacoustic signal processing.
Generate synthetic data for diverse bioacoustic sound event detection.
Provide a public benchmark for few-shot bioacoustics tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic data enhances bioacoustic signal processing
Transformer-based model for few-shot sound detection
Public benchmark for diverse bioacoustics tasks
πŸ”Ž Similar Papers
No similar papers found.