DroneAudioset: An Audio Dataset for Drone-based Search and Rescue

📅 2025-10-17
🤖 AI Summary
Drone-based search and rescue (SAR) faces challenges including strong self-noise, scarcity of real-world audio data, and lack of standardized auditory evaluation protocols. Method: We introduce the first open-source, MIT-licensed drone auditory dataset tailored to realistic SAR scenarios, comprising 23.5 hours of multi-source, in-flight audio recordings across diverse UAV platforms, throttle levels, microphone configurations, and environmental conditions—spanning a wide SNR range—and annotated with fine-grained human speech labels and acoustic feature tags. Contribution/Results: Our dataset uniquely captures natural acoustic interactions between human vocalizations and drone noise under actual flight conditions—overcoming limitations of synthetic data distortion and insufficient diversity in prior datasets. It significantly enhances training robustness and evaluation fidelity of human presence detection models under extreme noise, providing a critical empirical foundation and standardized benchmark for developing drone auditory perception algorithms and hardware–algorithm co-design.

📝 Abstract
Unmanned Aerial Vehicles (UAVs), or drones, are increasingly used in search and rescue missions to detect human presence. Existing systems primarily leverage vision-based methods, which are prone to failure under low visibility or occlusion. Drone-based audio perception offers promise but suffers from extreme ego-noise that masks sounds indicating human presence. Existing datasets are either limited in diversity or synthetic, lacking real acoustic interactions, and there are no standardized setups for drone audition. To this end, we present DroneAudioset (publicly available at https://huggingface.co/datasets/ahlab-drone-project/DroneAudioSet/ under the MIT license), a comprehensive drone audition dataset featuring 23.5 hours of annotated recordings, covering a wide range of signal-to-noise ratios (SNRs) from -57.2 dB to -2.5 dB, across various drone types, throttle levels, microphone configurations, and environments. The dataset enables development and systematic evaluation of noise suppression and classification methods for human-presence detection under challenging conditions, while also informing practical design considerations for drone audition systems, such as microphone placement trade-offs and the development of drone noise-aware audio processing. This dataset is an important step towards enabling the design and deployment of drone-audition systems.
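The SNR figures quoted above (down to -57.2 dB) follow the standard power-ratio definition. A minimal sketch of how such a figure is computed from matched speech and noise segments — illustrative only, not code from the paper, using synthetic signals in place of real recordings:

```python
import numpy as np

def snr_db(speech: np.ndarray, noise: np.ndarray) -> float:
    """SNR in dB: ratio of mean speech power to mean noise power."""
    p_speech = np.mean(speech.astype(np.float64) ** 2)
    p_noise = np.mean(noise.astype(np.float64) ** 2)
    return 10.0 * np.log10(p_speech / p_noise)

# Synthetic stand-ins: weak speech (amplitude ~0.001) against strong
# drone-like noise (amplitude ~1.0), 1 second at 16 kHz.
rng = np.random.default_rng(0)
speech = 0.001 * rng.standard_normal(16000)
noise = 1.0 * rng.standard_normal(16000)
print(f"{snr_db(speech, noise):.1f} dB")  # close to -60 dB
```

At such ratios the speech power is roughly a millionth of the noise power, which is why the paper emphasizes noise suppression and noise-aware processing as prerequisites for human-presence detection.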
Problem

Research questions and friction points this paper is trying to address.

Dataset addresses drone ego-noise masking human sounds
Standardized drone audition setups are currently lacking
Existing audio datasets lack real acoustic diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dataset for drone audition with real acoustic recordings
Covers diverse SNRs, drone types, and environments
Enables development of noise-suppression and human-presence detection methods
Chitralekha Gupta
Senior Research Fellow at National University of Singapore
Music Information Retrieval · Audio Signal Processing · Machine Learning · Deep Learning
Soundarya Ramesh
School of Computing, National University of Singapore
Praveen Sasikumar
School of Computing, National University of Singapore
Kian Peen Yeo
School of Computing, National University of Singapore
Suranga Nanayakkara
School of Computing, National University of Singapore