🤖 AI Summary
Low-cost, non-intrusive acoustic identification of unmanned aerial vehicles (UAVs) is critical for effective safety regulation but remains challenging under real-world noisy and variable environmental conditions.
Method: This paper proposes an end-to-end deep learning framework built around a novel multi-granularity acoustic feature-level fusion mechanism: it jointly models static spectral characteristics via MFCCs, dynamic time-frequency evolution via STFT spectrograms, and compact low-dimensional representations learned by an autoencoder. The architecture integrates CNNs for local spectral pattern extraction and RNNs for capturing long-range temporal dependencies.
Contribution/Results: Evaluated under cross-environment noise conditions, the model achieves 98.51% accuracy in binary UAV classification and 97.11% in multi-class identification, demonstrating robust generalization across acoustic environments. Its lightweight design and inference efficiency indicate strong potential for real-time deployment in practical UAV monitoring systems.
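As a rough illustration of the two hand-crafted feature streams named above, the sketch below computes an STFT magnitude spectrogram and MFCC-style coefficients from scratch with NumPy/SciPy. The frame size, hop length, mel-filter count, and coefficient count are illustrative defaults, not parameters reported by the paper.

```python
import numpy as np
from scipy.fftpack import dct

def stft_spectrogram(y, n_fft=512, hop=256):
    """Magnitude spectrogram: windowed frames -> rFFT -> (freq_bins, n_frames)."""
    window = np.hanning(n_fft)
    frames = [y[i:i + n_fft] * window for i in range(0, len(y) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1)).T

def mel_filterbank(sr, n_fft, n_mels=26):
    """Triangular mel filters mapping rFFT bins to a mel-spaced axis."""
    hz_to_mel = lambda f: 2595 * np.log10(1 + f / 700)
    mel_to_hz = lambda m: 700 * (10 ** (m / 2595) - 1)
    hz = mel_to_hz(np.linspace(0, hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * hz / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising slope
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling slope
    return fb

def mfcc(y, sr, n_mfcc=13, n_fft=512, hop=256):
    """MFCC-style features: power spectrogram -> mel energies -> log -> DCT-II."""
    power = stft_spectrogram(y, n_fft, hop) ** 2
    log_mel = np.log(mel_filterbank(sr, n_fft) @ power + 1e-10)
    return dct(log_mel, axis=0, norm='ortho')[:n_mfcc]

# Example on a synthetic 440 Hz tone standing in for a recorded audio clip:
sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440 * t)
spec = stft_spectrogram(y)   # (257, 61): STFT input for the CNN branch
feats = mfcc(y, sr)          # (13, 61): MFCC frame sequence for the RNN branch
```

In the framework described above, `spec` would feed the CNN branch as an image-like input, while the per-frame columns of `feats` would be consumed sequentially by the recurrent branch.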
📝 Abstract
Unmanned aerial vehicles (UAVs), commonly known as drones, are increasingly used across diverse domains, including logistics, agriculture, surveillance, and defense. While these systems provide numerous benefits, their misuse raises safety and security concerns, making effective detection mechanisms essential. Acoustic sensing offers a low-cost, non-intrusive alternative to vision- or radar-based detection, as drone propellers generate distinctive sound patterns. This study introduces AUDRON (AUdio-based Drone Recognition Network), a hybrid deep learning framework for drone sound detection that combines Mel-Frequency Cepstral Coefficients (MFCCs), Short-Time Fourier Transform (STFT) spectrograms processed with convolutional neural networks (CNNs), recurrent layers for temporal modeling, and autoencoder-based representations. Feature-level fusion integrates this complementary information before classification. Experimental evaluation demonstrates that AUDRON effectively differentiates drone acoustic signatures from background noise, achieving 98.51 percent accuracy in binary and 97.11 percent in multiclass classification while maintaining generalizability across varying conditions. The results highlight the advantage of combining multiple feature representations with deep learning for reliable acoustic drone detection and suggest the framework's potential for deployment in security and surveillance applications where visual or radar sensing may be limited.
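The feature-level fusion step described in the abstract (concatenating the branches' outputs before classification) can be sketched as below. This is a hypothetical, untrained forward pass: the embedding dimensions, layer sizes, and random weights are illustrative and are not taken from AUDRON's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical per-branch embeddings for one audio clip (dimensions are illustrative):
cnn_embed = rng.standard_normal(64)  # pooled CNN features from the STFT spectrogram
rnn_embed = rng.standard_normal(32)  # final recurrent hidden state over MFCC frames
ae_embed = rng.standard_normal(16)   # autoencoder bottleneck representation

# Feature-level fusion: concatenate the complementary representations
fused = np.concatenate([cnn_embed, rnn_embed, ae_embed])  # shape (112,)

# Small classifier head on the fused vector (binary: drone vs. background)
W1 = rng.standard_normal((112, 32)) * 0.1
b1 = np.zeros(32)
W2 = rng.standard_normal((32, 2)) * 0.1
b2 = np.zeros(2)

probs = softmax(relu(fused @ W1 + b1) @ W2 + b2)  # class probabilities summing to 1
```

Fusing at the feature level, rather than averaging per-branch decisions, lets the classifier weight interactions between the static (MFCC), dynamic (spectrogram), and compressed (autoencoder) views of the same signal.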