SHIELD8-UAV: Sequential 8-bit Hardware Implementation of a Precision-Aware 1D-F-CNN for Low-Energy UAV Acoustic Detection and Temporal Tracking

📅 2026-03-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes a sequential 8-bit hardware accelerator built on a shared multi-precision datapath, targeting the low-power, low-latency requirements of real-time acoustic drone detection and temporal tracking on edge devices. The design executes a 1D feature-driven CNN layer by layer and combines mixed-precision quantization (FP32/BF16/INT8/FXP8), structured channel pruning, and serialized dense-layer processing. Pruning cuts the flattened feature dimension from 35,072 to 8,704, which shortens the serialized dense-layer computation while preserving accuracy. Implemented on a Pynq-Z2 FPGA, the system achieves 89.91% detection accuracy in FP32 mode, with less than 2.5% accuracy degradation in the 8-bit modes, while consuming 0.94 W with an end-to-end latency of 116 ms. It also uses 5–9% less logic than parallel counterparts. ASIC synthesis results show a core area of 3.29 mm², a maximum operating frequency of 1.56 GHz, and a total power consumption of 1.65 W.
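As a rough illustration of what the two 8-bit modes named above involve, the sketch below quantizes a small weight list with symmetric per-tensor INT8 scaling and with a signed 8-bit fixed-point format. The function names, the per-tensor scaling choice, and the 6 fractional bits are assumptions made for this example; the summary does not state the paper's actual quantization scheme or FXP8 format.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 (assumed scheme): the largest magnitude maps to 127."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map INT8 codes back to real values."""
    return [x * scale for x in q]


def quantize_fxp8(values, frac_bits=6):
    """Signed 8-bit fixed point; frac_bits=6 is a hypothetical format choice."""
    step = 2 ** frac_bits
    return [max(-128, min(127, round(v * step))) for v in values]


weights = [0.5, -0.25, 0.125, -1.0]  # illustrative values, not from the paper
q_int8, scale = quantize_int8(weights)
recovered = dequantize(q_int8, scale)
q_fxp8 = quantize_fxp8(weights)

max_err = max(abs(w - r) for w, r in zip(weights, recovered))
print("INT8 codes:", q_int8, "max reconstruction error:", max_err)
print("FXP8 codes:", q_fxp8)
```

With symmetric scaling the INT8 reconstruction error stays within half a quantization step, whereas the fixed-point format trades that adaptivity for a scale that is a power of two and therefore needs only shifts in hardware.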

📝 Abstract

Real-time unmanned aerial vehicle (UAV) acoustic detection at the edge demands low-latency inference under strict power and hardware limits. This paper presents SHIELD8-UAV, a sequential 8-bit hardware implementation of a precision-aware 1D feature-driven CNN (1D-F-CNN) accelerator for continuous acoustic monitoring. The design performs layer-wise execution on a shared multi-precision datapath, eliminating the need for replicated processing elements. A layer-sensitivity quantisation framework supports FP32, BF16, INT8, and FXP8 modes, while structured channel pruning reduces the flattened feature dimension from 35,072 to 8,704 (a reduction of about 75%), thereby lowering serialised dense-layer cycles. The model achieves 89.91% detection accuracy in FP32 with less than 2.5% degradation in 8-bit modes. On a Pynq-Z2 FPGA, the accelerator uses 2,268 LUTs and 0.94 W with 116 ms end-to-end latency, which is 37.8% and 49.6% lower latency than QuantMAC and LPRE, respectively, and it uses 5–9% less logic than parallel designs. ASIC synthesis in UMC 40 nm technology shows a maximum operating frequency of 1.56 GHz, 3.29 mm² core area, and 1.65 W total power. These results demonstrate that sequential execution combined with precision-aware quantisation and serialisation-aware pruning enables practical low-energy edge inference without relying on massive parallelism.
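The pruning figures in the abstract can be checked with a few lines of arithmetic. The sketch below also estimates how the serialised dense-layer cycle count scales with the flattened dimension, using a deliberately simple cycle model (one shared MAC unit, one multiply-accumulate per cycle) and a hypothetical dense layer width of 64 units. Both the cycle model and the layer width are assumptions for illustration, not details taken from the paper.

```python
FLAT_BEFORE = 35_072  # flattened feature dimension before pruning (from the abstract)
FLAT_AFTER = 8_704    # flattened feature dimension after pruning (from the abstract)
DENSE_UNITS = 64      # hypothetical dense-layer width, chosen only for this example


def reduction_fraction(before, after):
    """Fraction of the flattened features removed by pruning."""
    return (before - after) / before


def serial_dense_cycles(flat_dim, out_units, macs_per_cycle=1):
    """Assumed cycle model: total MACs divided by MACs issued per cycle, rounded up."""
    total_macs = flat_dim * out_units
    return -(-total_macs // macs_per_cycle)  # ceiling division


frac = reduction_fraction(FLAT_BEFORE, FLAT_AFTER)
cycles_before = serial_dense_cycles(FLAT_BEFORE, DENSE_UNITS)
cycles_after = serial_dense_cycles(FLAT_AFTER, DENSE_UNITS)
ratio = cycles_before / cycles_after

print(f"features removed: {frac:.1%}")
print(f"serial dense cycles: {cycles_before} -> {cycles_after} ({ratio:.2f}x fewer)")
```

Under this model the first dense layer needs roughly four times fewer cycles after pruning, independent of the assumed layer width, because the cycle count is linear in the flattened dimension.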

Problem

Research questions and friction points this paper is trying to address.

UAV acoustic detection

low-energy edge inference

real-time detection

hardware constraints

temporal tracking

Innovation

Methods, ideas, or system contributions that make the work stand out.

sequential execution

precision-aware quantization

structured channel pruning