FSGNet: A Frequency-Aware and Semantic Guidance Network for Infrared Small Target Detection

📅 2026-03-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes FSGNet, a lightweight framework that addresses the semantic degradation occurring when U-Net architectures propagate deep features to shallow layers in infrared small target detection (IRSTD). The method combines three components: a multi-directional interactive attention mechanism, a multi-scale frequency-domain perception module, and a global semantic guidance flow. Specifically, it applies the fast Fourier transform to suppress target-like clutter, uses global pooling followed by upsampling to preserve cross-scale semantic consistency, and strengthens fine-grained feature extraction through multi-directional attention. Experiments on four public IRSTD datasets show that FSGNet improves detection accuracy for low-contrast small targets while remaining computationally efficient and robust, which suggests it is suitable for real-world applications.
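To make the frequency-domain idea concrete, here is a minimal NumPy sketch of filtering a 2-D map with the fast Fourier transform. It is not FSGNet's module: the function name `frequency_filter`, the fixed radial high-pass mask, and the `cutoff` value are all assumptions chosen for illustration, whereas the summary describes a multi-scale module inside a trained network. The sketch only shows the general mechanism of transforming to the frequency domain, reweighting the spectrum, and transforming back.

```python
import numpy as np


def frequency_filter(feature, cutoff=0.1):
    """Suppress low spatial frequencies in a 2-D map.

    Hypothetical illustration: a fixed radial high-pass mask stands in
    for whatever filtering a trained network would apply.
    """
    h, w = feature.shape
    spectrum = np.fft.fft2(feature)

    # Frequency coordinates in cycles per pixel, in the range [-0.5, 0.5).
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Keep only components at or above the cutoff; this also removes the DC term.
    mask = (radius >= cutoff).astype(feature.dtype)

    # The mask is symmetric, so the inverse transform is real up to rounding error.
    return np.fft.ifft2(spectrum * mask).real


# Toy input: a flat, bright background with one slightly brighter pixel.
image = np.full((32, 32), 5.0)
image[10, 20] += 1.0
out = frequency_filter(image)
peak = np.unravel_index(np.argmax(out), out.shape)
```

In this toy case the flat background lives entirely in the removed DC component, so the point-like target is what remains most prominent in `out`. Real infrared clutter is not flat, which is presumably why a fixed mask like this one would not be sufficient in practice.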

📝 Abstract

Infrared small target detection (IRSTD) aims to identify and distinguish small targets from complex backgrounds. Leveraging the powerful multi-scale feature fusion capability of the U-Net architecture, IRSTD has achieved significant progress. However, U-Net suffers from semantic degradation when transferring high-level features from deep to shallow layers, which limits the precise localization of small targets. To address this issue, this paper proposes FSGNet, a lightweight and effective detection framework incorporating frequency-aware and semantic guidance mechanisms. Specifically, a multi-directional interactive attention module is applied throughout the encoder to capture fine-grained and directional features, enhancing the network's sensitivity to small, low-contrast targets. To suppress background interference propagated through skip connections, a multi-scale frequency-aware module uses the fast Fourier transform to filter out target-similar clutter while preserving salient target structures. At the deepest layer, a global pooling module captures high-level semantic information, which is then upsampled and propagated to each decoder stage through global semantic guidance flows, ensuring semantic consistency and precise localization across scales. Extensive experiments on four public IRSTD datasets demonstrate that FSGNet achieves superior detection performance while maintaining high efficiency, highlighting its practical applicability and robustness. The code will be released at https://github.com/Wangtao-Bao/FSGNet.
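The global semantic guidance flow described in the abstract (pool at the deepest layer, upsample, deliver to every decoder stage) can be sketched roughly as follows. This is an assumption-laden illustration, not the authors' implementation: the helper names `global_semantic_vector` and `guide_decoder_stages` are hypothetical, average pooling and additive fusion are guesses, and all stages are assumed to share the same channel count so that no projection layer is needed.

```python
import numpy as np


def global_semantic_vector(deep_feature):
    """Global average pooling: (C, H, W) -> (C, 1, 1)."""
    return deep_feature.mean(axis=(1, 2), keepdims=True)


def guide_decoder_stages(deep_feature, decoder_features):
    """Add the pooled deep context to every decoder feature map.

    Assumptions: equal channel counts at every stage and additive fusion.
    """
    context = global_semantic_vector(deep_feature)
    guided = []
    for feat in decoder_features:
        # Nearest-neighbour upsampling of a 1x1 map to (H, W) is a broadcast.
        upsampled = np.broadcast_to(context, feat.shape)
        guided.append(feat + upsampled)
    return guided


# Toy example: one deep map and three decoder stages of growing resolution.
rng = np.random.default_rng(0)
deep = rng.normal(size=(8, 4, 4))
decoder = [np.zeros((8, size, size)) for size in (8, 16, 32)]
guided = guide_decoder_stages(deep, decoder)
```

Because each decoder stage receives the context directly from the deepest layer rather than through a chain of intermediate stages, the same semantic vector reaches every scale unchanged, which is one plausible reading of how the flow counters deep-to-shallow semantic degradation.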

Problem

Research questions and friction points this paper is trying to address.

Infrared small target detection

Semantic degradation

Precise localization

Background interference

Multi-scale feature fusion

Innovation

Methods, ideas, or system contributions that make the work stand out.

Frequency-aware

Semantic guidance

Multi-directional attention