Real-Time Emergency Vehicle Siren Detection with Efficient CNNs on Embedded Hardware

📅 2025-07-02
🤖 AI Summary
Problem: Real-time, low-latency, robust detection of emergency vehicle sirens in noisy urban environments remains challenging on resource-constrained edge devices. Method: We propose a lightweight acoustic event detection system tailored to embedded edge platforms. First, we construct AudioSet-EV, a high-fidelity, structured dataset curated specifically for emergency vehicle siren detection. Second, we design E2PANNs, a hardware-efficient convolutional neural network optimized for low-power inference. Third, we develop a multithreaded inference engine that combines adaptive frame scheduling, probabilistic smoothing, and a decision module based on a finite-state machine. Contribution/Results: On a Raspberry Pi 5, the system achieves an end-to-end average latency below 300 ms and an event-level F1-score of 92.7%, and it reduces the false trigger rate by 68%. It supports WebSocket-based remote monitoring and scalable deployment in distributed acoustic sensor networks. The experiments demonstrate that low-cost edge devices can work collaboratively to track emergency vehicles in real time in smart cities.

📝 Abstract
We present a full-stack emergency vehicle (EV) siren detection system designed for real-time deployment on embedded hardware. The proposed approach is based on E2PANNs, a fine-tuned convolutional neural network derived from EPANNs and optimized for binary sound event detection (SED) under urban acoustic conditions. A key contribution is a set of curated and semantically structured datasets (AudioSet-EV, AudioSet-EV Augmented, and Unified-EV), developed with a custom AudioSet-Tools framework to overcome the low reliability of standard AudioSet annotations. The system is deployed on a Raspberry Pi 5 equipped with a high-fidelity DAC+microphone board and implements a multithreaded inference engine with adaptive frame sizing, probability smoothing, and a decision-state machine that controls false positive activations. A remote WebSocket interface provides real-time monitoring and supports live demonstrations. Performance is evaluated using both framewise and event-based metrics across multiple configurations. Results show that the system achieves low-latency detection with improved robustness under realistic audio conditions. This work demonstrates the feasibility of deploying IoS-compatible SED solutions on low-cost edge devices. Connected over WebSocket, such devices can form distributed acoustic monitoring networks that enable collaborative emergency vehicle tracking across smart city infrastructures.
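The abstract names probability smoothing and a decision-state machine as the means of limiting false positive activations, but gives no parameters. The following is a minimal sketch of one common way to combine the two, not the authors' implementation: a causal moving average over per-frame siren probabilities, followed by a two-state machine with hysteresis. The window length, the thresholds `on_thr` and `off_thr`, and the `min_on_frames` requirement are all hypothetical values chosen for illustration.

```python
from collections import deque
from typing import Iterable, List


def moving_average(probs: Iterable[float], window: int) -> List[float]:
    """Smooth per-frame probabilities with a causal moving average."""
    buf = deque(maxlen=window)  # keeps only the most recent `window` values
    out = []
    for p in probs:
        buf.append(p)
        out.append(sum(buf) / len(buf))
    return out


class SirenStateMachine:
    """Two-state detector with hysteresis (all thresholds are assumptions)."""

    def __init__(self, on_thr: float = 0.7, off_thr: float = 0.3,
                 min_on_frames: int = 2) -> None:
        self.on_thr = on_thr                # smoothed probability needed to arm
        self.off_thr = off_thr              # smoothed probability that releases
        self.min_on_frames = min_on_frames  # consecutive frames needed to trigger
        self.active = False
        self._streak = 0

    def update(self, p: float) -> bool:
        """Consume one smoothed probability and return the current state."""
        if not self.active:
            # Count consecutive frames above the "on" threshold.
            self._streak = self._streak + 1 if p >= self.on_thr else 0
            if self._streak >= self.min_on_frames:
                self.active = True
                self._streak = 0
        elif p <= self.off_thr:
            # Release only once the probability falls below the lower threshold.
            self.active = False
        return self.active


def detect(probs: Iterable[float], window: int = 3) -> List[bool]:
    """Run smoothing followed by the state machine over a probability stream."""
    fsm = SirenStateMachine()
    return [fsm.update(p) for p in moving_average(probs, window)]
```

With these illustrative settings, a single high-probability frame surrounded by low ones is averaged away and never activates the detector, while a sustained run of high probabilities does. The gap between the two thresholds keeps the state from flickering when the smoothed probability hovers near a single cut-off.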
Problem

Research questions and friction points this paper is trying to address.

Detect emergency vehicle sirens in real-time using efficient CNNs
Optimize binary sound event detection for urban acoustic conditions
Deploy low-latency SED solutions on low-cost embedded hardware
Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient CNNs for real-time siren detection
Custom datasets with AudioSet-Tools framework
Multithreaded inference on Raspberry Pi 5
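As a rough illustration of the multithreaded inference idea, the sketch below decouples audio capture from model inference with a bounded queue, using only the Python standard library. It is an assumed producer-consumer layout, not the engine described in the paper: `run_pipeline`, `energy_model`, and `max_queue` are hypothetical names, and `energy_model` is a stand-in for the CNN that simply scores a frame by its mean absolute amplitude.

```python
import queue
import threading
from typing import Callable, Iterable, List, Sequence

_STOP = object()  # sentinel that tells the inference thread to finish


def run_pipeline(frames: Iterable[Sequence[float]],
                 model: Callable[[Sequence[float]], float],
                 max_queue: int = 8) -> List[float]:
    """Score frames on a worker thread while another thread feeds them."""
    q: queue.Queue = queue.Queue(maxsize=max_queue)
    scores: List[float] = []

    def capture() -> None:
        # Stand-in for the microphone reader: push frames, then the sentinel.
        for frame in frames:
            q.put(frame)  # blocks while the queue is full (back-pressure)
        q.put(_STOP)

    def infer() -> None:
        # Single consumer, so scores come out in capture order.
        while True:
            item = q.get()
            if item is _STOP:
                break
            scores.append(model(item))

    threads = [threading.Thread(target=capture), threading.Thread(target=infer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return scores


def energy_model(frame: Sequence[float]) -> float:
    """Placeholder for the CNN: mean absolute amplitude clipped to [0, 1]."""
    return min(1.0, sum(abs(x) for x in frame) / len(frame))
```

The blocking `put` is the simplest choice for a sketch; a latency-sensitive deployment might prefer to drop stale frames when the queue fills rather than stall the capture thread.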
Marco Giordano
Dpt. of Information Engineering, Computer Science and Mathematics (DISIM), University of L’Aquila, L’Aquila, Italy
Stefano Giacomelli
Dpt. of Information Engineering, Computer Science and Mathematics (DISIM), University of L’Aquila, L’Aquila, Italy
Claudia Rinaldi
CNIT - National Inter-University Consortium for Telecommunications
wireless communications, digital signal processing, multimedia
Fabio Graziosi
University of L'Aquila - Italy