🤖 AI Summary
Addressing three key challenges in UAV-based multispectral remote sensing—degraded modality complementarity under low-light conditions, susceptibility of small targets to redundant information, and high computational complexity hindering deployment of Transformer-based models—this paper proposes a dual-domain enhancement and priority-guided Mamba fusion detection framework. The framework comprises: (1) a cross-scale Wavelet-Mamba module coupled with a Fourier-domain detail restoration module to jointly enhance spatial-textural and frequency-structural features in low-light imagery; and (2) a modality-difference-driven priority scanning strategy that enables linear-complexity, interference-robust multispectral feature fusion via Mamba. Evaluated on the DroneVehicle and VEDAI benchmarks, the method achieves significant improvements in small-object detection accuracy while satisfying stringent real-time inference and onboard resource constraints of UAV platforms.
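As a rough illustration of why item (1) targets the low-frequency wavelet band for brightness, the sketch below splits a grayscale image with a one-level Haar transform, rescales only the low-frequency band, and reconstructs. This is not the paper's CSWM block: the constant `gain` is a stand-in assumption for the learned cross-scale Mamba scanning, and the function names `haar_dwt2`, `haar_idwt2`, and `brighten_low_band` are hypothetical.

```python
def haar_dwt2(img):
    """One-level orthonormal 2D Haar transform of an even-sized grayscale image."""
    low, det_h, det_v, det_d = [], [], [], []
    for i in range(0, len(img), 2):
        rows = ([], [], [], [])
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)  # low-frequency (brightness) band
            rows[1].append((a - b + c - d) / 2)  # left-right differences
            rows[2].append((a + b - c - d) / 2)  # top-bottom differences
            rows[3].append((a - b - c + d) / 2)  # diagonal differences
        for band, row in zip((low, det_h, det_v, det_d), rows):
            band.append(row)
    return low, det_h, det_v, det_d


def haar_idwt2(low, det_h, det_v, det_d):
    """Inverse of haar_dwt2: rebuild each 2x2 pixel block from the four bands."""
    img = []
    for i in range(len(low)):
        top, bottom = [], []
        for j in range(len(low[0])):
            l, h, v, d = low[i][j], det_h[i][j], det_v[i][j], det_d[i][j]
            top += [(l + h + v + d) / 2, (l - h + v - d) / 2]
            bottom += [(l + h - v - d) / 2, (l - h - v + d) / 2]
        img += [top, bottom]
    return img


def brighten_low_band(img, gain):
    """Scale only the low-frequency band; detail bands are left untouched.

    ASSUMPTION: a constant gain replaces the paper's learned enhancement.
    """
    low, det_h, det_v, det_d = haar_dwt2(img)
    low = [[gain * x for x in row] for row in low]
    return haar_idwt2(low, det_h, det_v, det_d)


if __name__ == "__main__":
    dark = [[1, 3, 2, 2],
            [5, 7, 4, 0]]
    for row in brighten_low_band(dark, gain=2.0):
        print(row)
```

Because the detail bands are unchanged, differences between neighboring pixels inside each 2x2 block survive the brightening, which is the intuition behind treating brightness and texture in separate bands.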
📝 Abstract
Multispectral remote sensing object detection is an important application of unmanned aerial vehicles (UAVs). However, it faces three challenges. First, low-light remote sensing images reduce the complementarity available during multi-modality fusion. Second, the modeling of small local targets is easily disturbed by redundant information in the fusion stage. Third, the quadratic computational complexity of transformer-based methods makes them hard to deploy on UAV platforms. To address these limitations, and motivated by the linear complexity of Mamba, a UAV multispectral object detector with dual-domain enhancement and priority-guided Mamba fusion (DEPF) is proposed. First, to enhance low-light remote sensing images, a Dual-Domain Enhancement Module (DDE) is designed, which contains a Cross-Scale Wavelet Mamba (CSWM) block and a Fourier Details Recovery (FDR) block. CSWM applies cross-scale Mamba scanning to the low-frequency components to enhance the global brightness of images, while FDR constructs a spectrum recovery network that enhances frequency-spectrum features to recover texture details. Second, to strengthen local target modeling and reduce the impact of redundant information during fusion, a Priority-Guided Mamba Fusion Module (PGMF) is designed. PGMF introduces the concept of priority scanning, in which the scan starts from local target features according to priority scores obtained from the modality difference. Experiments on the DroneVehicle and VEDAI datasets show that DEPF performs well on object detection compared with state-of-the-art methods. Our code is available in the supplementary material.
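The priority-scanning idea can be sketched in a few lines. This is not the authors' code. It assumes that the priority score of a token is the mean absolute difference between its visible and infrared features, that larger differences are scanned first, and that a simple decaying recurrence can stand in for the Mamba selective scan. The names `priority_scan_order`, `linear_scan`, and `priority_guided_fusion` are hypothetical.

```python
def priority_scan_order(rgb_tokens, ir_tokens):
    """Rank token indices by modality difference, largest first.

    ASSUMPTION: score = mean absolute difference between the two modalities.
    """
    scores = [
        sum(abs(r - t) for r, t in zip(rgb, ir)) / len(rgb)
        for rgb, ir in zip(rgb_tokens, ir_tokens)
    ]
    order = sorted(range(len(scores)), key=lambda k: -scores[k])
    return order, scores


def linear_scan(tokens, decay=0.5):
    """Single pass over the sequence: state = decay * state + token.

    Stand-in for a state-space scan; cost grows linearly with sequence length.
    """
    state = [0.0] * len(tokens[0])
    outputs = []
    for tok in tokens:
        state = [decay * s + x for s, x in zip(state, tok)]
        outputs.append(state)
    return outputs


def priority_guided_fusion(rgb_tokens, ir_tokens, decay=0.5):
    """Sum the modalities, scan in priority order, then restore token positions."""
    order, _ = priority_scan_order(rgb_tokens, ir_tokens)
    fused = [[r + t for r, t in zip(rgb, ir)] for rgb, ir in zip(rgb_tokens, ir_tokens)]
    scanned = linear_scan([fused[k] for k in order], decay)
    out = [None] * len(fused)
    for position, k in enumerate(order):
        out[k] = scanned[position]  # scatter back to the original index
    return out


if __name__ == "__main__":
    rgb = [[0.1, 0.1], [0.9, 0.8], [0.2, 0.2], [0.5, 0.1]]
    ir = [[0.1, 0.1], [0.1, 0.2], [0.2, 0.3], [0.5, 0.9]]
    order, scores = priority_scan_order(rgb, ir)
    print(order)  # token 1 differs most between modalities, token 0 least
    print(priority_guided_fusion(rgb, ir))
```

Scanning high-priority tokens first means their features enter the recurrent state before low-difference background tokens, so in this toy version the first-scanned token is unaffected by anything else in the sequence.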