AI Summary
In infrared imaging, densely packed small targets produce overlapping signals, which impedes accurate target counting, sub-pixel localization, and radiometric intensity inversion. To address this, we propose the Dynamic Iterative Shrinkage Thresholding Network (DISTA-Net) for unmixing closely spaced infrared small targets. Methodologically, DISTA-Net is the first to jointly model sparse reconstruction, sub-pixel localization, and photometric inversion via dynamically learnable convolutional weights and threshold parameters. Our contributions include: (1) releasing the first open-source ecosystem for this task, comprising CSIST-100K, a large-scale synthetic dataset; CSO-mAP, a task-specific evaluation metric; and GrokCSO, an end-to-end inference toolkit; (2) achieving state-of-the-art performance on public benchmarks for end-to-end unmixing of adjacent small targets, with significant improvements in both sub-pixel detection accuracy and radiometric estimation fidelity.
Abstract
Resolving closely-spaced small targets in dense clusters presents a significant challenge in infrared imaging, as the overlapping signals hinder precise determination of their quantity, sub-pixel positions, and radiation intensities. While deep learning has advanced the field of infrared small target detection, its application to closely-spaced infrared small targets has not yet been explored. This gap exists primarily due to the complexity of separating superimposed characteristics and the lack of an open-source infrastructure. In this work, we propose the Dynamic Iterative Shrinkage Thresholding Network (DISTA-Net), which reconceptualizes traditional sparse reconstruction within a dynamic framework. DISTA-Net adaptively generates convolution weights and thresholding parameters to tailor the reconstruction process in real time. To the best of our knowledge, DISTA-Net is the first deep learning model designed specifically for the unmixing of closely-spaced infrared small targets, achieving superior sub-pixel detection accuracy. Moreover, we have established the first open-source ecosystem to foster further research in this field. This ecosystem comprises three key components: (1) CSIST-100K, a publicly available benchmark dataset; (2) CSO-mAP, a custom evaluation metric for sub-pixel detection; and (3) GrokCSO, an open-source toolkit featuring DISTA-Net and other models. Our code and dataset are available at https://github.com/GrokCV/GrokCSO.
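For orientation, the classical (non-learned) iterative shrinkage-thresholding algorithm that the abstract refers to as "traditional sparse reconstruction" can be sketched as below. This is not the DISTA-Net implementation: it assumes a fixed linear measurement matrix, a fixed step size, and a fixed threshold, whereas the abstract states that DISTA-Net generates its convolution weights and thresholds adaptively. The function names (`soft_threshold`, `ista`, `gaussian_dictionary`) and the 1-D Gaussian point-spread model are hypothetical choices made only for illustration.

```python
import math


def soft_threshold(v, theta):
    """Shrink every entry toward zero by theta (proximal operator of the L1 norm)."""
    return [math.copysign(max(abs(x) - theta, 0.0), x) for x in v]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def ista(A, y, lam, step, n_iter=200):
    """Classical ISTA for min_x 0.5 * ||A x - y||^2 + lam * ||x||_1.

    `step` should not exceed 1 / L, where L is the largest eigenvalue
    of A^T A; choosing it is left to the caller in this sketch.
    """
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = matvec(At, residual)  # gradient of the data-fidelity term
        x = soft_threshold(
            [xi - step * gi for xi, gi in zip(x, grad)], lam * step
        )
    return x


def gaussian_dictionary(n_pixels, upsample, sigma):
    """Hypothetical 1-D measurement matrix (an assumption, not from the paper).

    Column j holds the response of every pixel to a unit point source at
    the j-th sub-pixel grid position. The Gaussian PSF is sampled at pixel
    centres rather than integrated over the pixel area.
    """
    A = []
    for i in range(n_pixels):
        row = []
        for j in range(n_pixels * upsample):
            pos = (j + 0.5) / upsample - 0.5  # grid-cell centre in pixel units
            row.append(math.exp(-((i - pos) ** 2) / (2 * sigma ** 2)))
        A.append(row)
    return A
```

In this toy setting, the non-zero entries of the vector returned by `ista(gaussian_dictionary(...), y, lam, step)` would indicate the estimated sub-pixel positions and intensities of the sources that make up the blurred measurement `y`. A learned variant replaces the fixed `A`, `step`, and `lam` with quantities produced by the network.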