🤖 AI Summary
Infrared small target detection (IRSTD) remains challenging due to low signal-to-clutter ratios, highly variable target morphology, and the absence of salient visual cues. To address these issues, this paper proposes LRRNet, a fully end-to-end deep framework that directly learns structure-aware, low-rank background representations in the image domain, bypassing conventional patch-based processing and explicit matrix decomposition. The approach is physically grounded in the inherent compressibility prior of infrared scenes. Key contributions include: (1) a novel Compression–Reconstruction–Subtraction (CRS) paradigm; (2) unsupervised, end-to-end modeling of low-rank structural priors via deep neural networks; and (3) the complete elimination of patch-based design. Extensive experiments demonstrate state-of-the-art performance across multiple public benchmarks, outperforming 38 existing methods. The model achieves an average inference speed of 82.34 FPS and remains robust under severe noise, as validated on the NoisySIRST dataset.
📝 Abstract
Infrared small target detection (IRSTD) remains a long-standing challenge in complex backgrounds due to low signal-to-clutter ratios (SCR), diverse target morphologies, and the absence of distinctive visual cues. While recent deep learning approaches aim to learn discriminative representations, the intrinsic variability and weak priors of small targets often lead to unstable performance. In this paper, we propose a novel end-to-end IRSTD framework, termed LRRNet, which leverages the low-rank property of infrared image backgrounds. Inspired by the physical compressibility of cluttered scenes, our approach adopts a compression–reconstruction–subtraction (CRS) paradigm to directly model structure-aware low-rank background representations in the image domain, without relying on patch-based processing or explicit matrix decomposition. To the best of our knowledge, this is the first work to directly learn low-rank background structures using deep neural networks in an end-to-end manner. Extensive experiments on multiple public datasets demonstrate that LRRNet outperforms 38 state-of-the-art methods in terms of detection accuracy, robustness, and computational efficiency. Remarkably, it achieves real-time performance with an average speed of 82.34 FPS. Evaluations on the challenging NoisySIRST dataset further confirm the model's resilience to sensor noise. The source code will be made publicly available upon acceptance.
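To make the compression–reconstruction–subtraction intuition concrete, here is a minimal NumPy sketch. It uses a truncated SVD as a classical stand-in for the compressor: smooth infrared clutter is approximately low-rank, so a low-rank reconstruction recovers the background and the residual after subtraction highlights the small target. Note that this is an illustration of the underlying prior only; LRRNet itself learns the compression and reconstruction end-to-end with a neural network and does not perform explicit matrix decomposition. All names and parameters here (`k`, the synthetic scene, the target location) are illustrative assumptions, not the paper's code.

```python
import numpy as np

# Synthetic infrared-like scene: smooth (hence low-rank) clutter plus one small bright target.
rng = np.random.default_rng(0)
H, W = 64, 64
y, x = np.mgrid[0:H, 0:W]
background = np.sin(y / 12.0) + 0.5 * np.cos(x / 9.0)   # smooth clutter -> approximately rank 2
image = background + 0.05 * rng.standard_normal((H, W))  # mild sensor noise
image[30:33, 40:43] += 3.0                               # tiny 3x3 target (illustrative position)

# Compression–reconstruction: keep only the top-k singular components.
# (Classical stand-in for the learned compressor in the CRS paradigm.)
U, s, Vt = np.linalg.svd(image, full_matrices=False)
k = 2
recon = (U[:, :k] * s[:k]) @ Vt[:k, :]                   # low-rank background estimate

# Subtraction: the residual suppresses the background and exposes the target.
residual = image - recon
peak = np.unravel_index(np.abs(residual).argmax(), residual.shape)
print("residual peak at", peak)
```

Because the target occupies few pixels, its energy is negligible in the leading singular components, so the rank-2 reconstruction captures the clutter while the residual peaks inside the injected 3×3 target region.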