🤖 AI Summary
Real-time detection of tiny objects (<32 pixels, down to 4 pixels) in UAV imagery remains challenging due to high false-positive rates and computational constraints on resource-limited platforms. To address this, we propose a hierarchical lightweight YOLOv8 variant tailored for edge deployment. Our key contributions are: (1) a Hierarchical Extended Path Aggregation Network (HEPAN) to enhance multi-scale feature fusion; (2) lightweight Inverted Residual Depthwise Convolution Blocks (IRDCBs) and Lightweight Downsample (LDown) modules that substantially reduce computational overhead; and (3) a high-resolution detection head specifically optimized for ultra-small targets. Evaluated on VisDrone2019, our model reduces parameter count by 32.7% and FLOPs by 41.5%, while improving mAP by 2.8% and significantly lowering false detection rates. It achieves 37 FPS on Jetson AGX Orin, striking an effective balance between accuracy and real-time inference capability.
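To give a rough sense of where savings of this kind can come from, the sketch below compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1x1 pointwise convolution. It assumes the lightweight blocks rely on this common factorization; the function names and the 128-channel example are illustrative choices, not values taken from the paper, and the arithmetic says nothing about the reported 32.7% figure for the full model.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Illustrative layer size (an assumption, not from the paper).
c_in, c_out, k = 128, 128, 3
standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)

print("standard conv weights: ", standard)
print("separable conv weights:", separable)
print("ratio: %.3f" % (separable / standard))
```

The ratio for a single layer is 1/c_out + 1/k², so the benefit grows with channel width and kernel size. The saving for a whole network is smaller, because not every layer is replaced.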
📝 Abstract
Real-time detection of small objects in complex scenes, such as imagery captured by unmanned aerial vehicles (UAVs), poses the dual challenge of detecting small targets (<32 pixels) and maintaining real-time efficiency on resource-constrained platforms. While YOLO-series detectors have achieved remarkable success in real-time detection of large objects, they suffer from significantly higher false negative rates in drone-based detection, where small objects dominate. This paper proposes HierLight-YOLO, a lightweight model with hierarchical feature fusion, built on the YOLOv8 architecture, that enhances real-time detection of small objects. We propose the Hierarchical Extended Path Aggregation Network (HEPAN), a multi-scale feature fusion method that uses hierarchical cross-level connections to improve small object detection accuracy. HierLight-YOLO includes two lightweight modules, the Inverted Residual Depthwise Convolution Block (IRDCB) and the Lightweight Downsample (LDown) module, which significantly reduce the model's parameters and computational complexity without sacrificing detection capability. A small object detection head is designed to further enhance spatial resolution and feature fusion for detecting tiny objects (4 pixels). Comparison experiments and ablation studies on the VisDrone2019 benchmark demonstrate the state-of-the-art performance of HierLight-YOLO.
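A short calculation shows why an extra high-resolution head matters for 4-pixel targets. The sketch below assumes a 640x640 input and the standard YOLOv8 head strides of 8, 16, and 32. The stride-4 entry is an assumption about what a higher-resolution head could look like; the abstract does not state the stride, and the helper names are hypothetical.

```python
def grid_size(input_px, stride):
    """Side length of the feature map produced at a given stride."""
    return input_px // stride


def cells_spanned(object_px, stride):
    """How many feature-map cells an object of the given width covers."""
    return object_px / stride


INPUT_PX = 640   # assumed input resolution
OBJECT_PX = 4    # tiny-object size quoted in the abstract

# Strides 8/16/32 are the stock YOLOv8 heads; stride 4 is an assumed extra head.
heads = [("P2 (assumed)", 4), ("P3", 8), ("P4", 16), ("P5", 32)]

for name, stride in heads:
    side = grid_size(INPUT_PX, stride)
    span = cells_spanned(OBJECT_PX, stride)
    print(f"{name:13s} stride={stride:2d} grid={side}x{side} cells spanned={span}")
```

Under these assumptions, a 4-pixel object covers only half a cell at stride 8, while at stride 4 it fills one whole cell, which is one plausible reason a finer head helps with such targets.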