🤖 AI Summary
Small-object detection (SOD) faces a persistent accuracy-efficiency trade-off caused by low spatial resolution and limited contextual cues, compounded by occlusion, background clutter, and class imbalance. This survey reviews SOD advances published in 2024–2025, covering multi-scale feature fusion, super-resolution reconstruction, attention mechanisms, Vision Transformers, model lightweighting, knowledge distillation, and self-supervised pretraining, with attention to resource-constrained settings such as edge devices and unmanned aerial vehicles (UAVs). It consolidates major benchmark datasets and evaluation protocols, with particular emphasis on size-aware average precision (AP), and highlights four real-world application domains: traffic monitoring, maritime security, industrial quality inspection, and smart agriculture.
📝 Abstract
Small object detection (SOD) is a critical yet challenging task in computer vision, with applications spanning surveillance, autonomous systems, medical imaging, and remote sensing. Unlike larger objects, small objects contain limited spatial and contextual information, making accurate detection difficult. Challenges such as low resolution, occlusion, background interference, and class imbalance further complicate the problem. This survey provides a comprehensive review of recent advancements in SOD using deep learning, focusing on articles published in Q1 journals during 2024-2025. We analyze challenges, state-of-the-art techniques, datasets, evaluation metrics, and real-world applications. Deep learning research has introduced innovative solutions, including multi-scale feature extraction, super-resolution (SR) techniques, attention mechanisms, and transformer-based architectures. Additionally, improvements in data augmentation, synthetic data generation, and transfer learning have addressed data scarcity and domain adaptation issues. Furthermore, emerging trends such as lightweight neural networks, knowledge distillation (KD), and self-supervised learning offer promising directions for improving detection efficiency, particularly in resource-constrained environments like Unmanned Aerial Vehicle (UAV)-based surveillance and edge computing. We also review widely used datasets, along with standard evaluation metrics such as mean Average Precision (mAP) and size-specific AP scores. The survey highlights real-world applications, including traffic monitoring, maritime surveillance, industrial defect detection, and precision agriculture. Finally, we discuss open research challenges and future directions, emphasizing the need for robust domain adaptation techniques, better feature fusion strategies, and real-time performance optimization.