MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection

📅 2025-06-15
🤖 AI Summary
Small-object detection in UAV imagery is hindered by extremely small object sizes, low signal-to-noise ratios, and cluttered backgrounds; existing multi-scale approaches often sacrifice fine-grained detail or incur excessive computational overhead. To address this, we propose a lightweight multi-scale global–local feature fusion framework built around a novel FusionLock mechanism. This mechanism jointly integrates token-statistics self-attention (for long-range semantic modeling), directional convolution with parallel attention (to enhance local structural perception), and dynamic pixel-wise weighting (to suppress background interference), enabling efficient and precise global–local feature coupling. Evaluated on the VisDrone benchmark, our method consistently outperforms state-of-the-art approaches across diverse backbone networks and detector architectures, achieving significant gains in both precision and recall while maintaining real-time inference speed, making it well suited for resource-constrained onboard UAV platforms.

📝 Abstract
Small object detection in UAV imagery is crucial for applications such as search-and-rescue, traffic monitoring, and environmental surveillance, but it is hampered by tiny object size, low signal-to-noise ratios, and limited feature extraction. Existing multi-scale fusion methods help, but add computational burden and blur fine details, making small object detection in cluttered scenes difficult. To overcome these challenges, we propose the Multi-scale Global-detail Feature Integration Strategy (MGDFIS), a unified fusion framework that tightly couples global context with local detail to boost detection performance while maintaining efficiency. MGDFIS comprises three synergistic modules: the FusionLock-TSS Attention Module, which marries token-statistics self-attention with DynamicTanh normalization to highlight spectral and spatial cues at minimal cost; the Global-detail Integration Module, which fuses multi-scale context via directional convolution and parallel attention while preserving subtle shape and texture variations; and the Dynamic Pixel Attention Module, which generates pixel-wise weighting maps to rebalance uneven foreground and background distributions and sharpen responses to true object regions. Extensive experiments on the VisDrone benchmark demonstrate that MGDFIS consistently outperforms state-of-the-art methods across diverse backbone architectures and detection frameworks, achieving superior precision and recall with low inference time. By striking an optimal balance between accuracy and resource usage, MGDFIS provides a practical solution for small-object detection on resource-constrained UAV platforms.
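The abstract pairs token-statistics self-attention with DynamicTanh (DyT) normalization. As described in prior work, DyT is an element-wise, statistics-free stand-in for layer normalization; a minimal NumPy sketch (scalar parameters here for illustration — in practice they are learnable, typically per-channel):

```python
import numpy as np

def dynamic_tanh(x, alpha=0.5, gamma=1.0, beta=0.0):
    """DynamicTanh (DyT): gamma * tanh(alpha * x) + beta.

    The tanh squashes extreme activations, mimicking normalization's
    stabilizing effect without computing any batch or layer statistics,
    which keeps the per-token cost constant.
    """
    return gamma * np.tanh(alpha * x) + beta

# Extreme inputs are bounded; moderate inputs pass through near-linearly.
x = np.array([-100.0, -1.0, 0.0, 1.0, 100.0])
y = dynamic_tanh(x)
```

Because it needs no reductions over the batch or channel axes, DyT is a natural fit for the "minimal cost" claim made for the FusionLock-TSS module.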
Problem

Research questions and friction points this paper is trying to address.

Improves small object detection in UAV imagery
Reduces computational burden while preserving fine details
Balances accuracy and efficiency for resource-constrained platforms
Innovation

Methods, ideas, or system contributions that make the work stand out.

FusionLock-TSS Attention Module enhances spectral and spatial cues
Global-detail Integration Module preserves shape and texture variations
Dynamic Pixel Attention Module rebalances foreground and background
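The pixel-wise rebalancing behind the Dynamic Pixel Attention Module can be sketched as a 1×1 projection of the feature map followed by a sigmoid gate that damps background pixels. The sketch below illustrates the general pattern in NumPy, with a random projection standing in for learned weights; it is an assumption about the mechanism, not the paper's exact implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamic_pixel_attention(features, proj_weights):
    """Reweight a feature map pixel by pixel.

    features: (C, H, W) feature map
    proj_weights: (C,) 1x1-conv-style projection collapsing channels
    Returns the reweighted features and the (H, W) weight map.
    """
    # 1x1 projection: one scalar logit per spatial location
    logits = np.tensordot(proj_weights, features, axes=(0, 0))  # (H, W)
    weights = sigmoid(logits)                                   # values in (0, 1)
    # Broadcast the map over channels: low-weight (background) pixels are damped
    return features * weights[None, :, :], weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4, 4))   # toy 8-channel, 4x4 feature map
w = rng.normal(size=(8,))            # placeholder for learned 1x1 weights
out, wmap = dynamic_pixel_attention(feats, w)
```

The multiplicative gate leaves the feature dimensionality unchanged, so a module of this shape can be dropped between any backbone stage and detection head.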
Yuxiang Wang
School of Computer Science, The University of Sydney, NSW, Australia
Xuecheng Bai
Shenyang Ligong University
Object Detection · Low-light Image Enhancement
Boyu Hu
School of Statistics, University of International Business and Economics, Beijing, China
Chuanzhi Xu
Student, The University of Sydney
Neuromorphic Vision · High-level Vision · Computational Aesthetics
Haodong Chen
School of Computer Science, The University of Sydney, NSW, Australia
Vera Chung
School of Computer Science, The University of Sydney, NSW, Australia
Tingxue Li
School of Automation and Electrical Engineering, Shenyang Ligong University, Liaoning, China