FSGNet: A Frequency-Aware and Semantic Guidance Network for Infrared Small Target Detection

📅 2026-03-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes FSGNet, a lightweight framework that addresses the semantic degradation occurring when U-Net architectures propagate deep features to shallow layers in infrared small target detection (IRSTD). The method combines a multi-directional interactive attention mechanism, a multi-scale frequency-domain perception module, and a global semantic guidance flow. Specifically, it uses the fast Fourier transform to suppress target-like clutter, applies global pooling followed by upsampling to preserve cross-scale semantic consistency, and strengthens fine-grained feature extraction through multi-directional attention. Experiments on four public IRSTD datasets indicate that FSGNet improves detection performance for small, low-contrast targets while remaining computationally efficient, which suggests practical applicability.
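The "global pooling followed by upsampling" step can be pictured with a small NumPy sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function name `global_semantic_guidance`, the additive fusion, the equal channel counts across stages, and the nearest-neighbour upsampling are all hypothetical choices. The actual module in the paper may use learned layers that are not described on this page.

```python
import numpy as np


def global_semantic_guidance(deep_feat, decoder_feats):
    """Broadcast a pooled semantic vector to every decoder stage.

    deep_feat:     array of shape (C, H, W) from the deepest layer.
    decoder_feats: list of arrays of shape (C, h_i, w_i), one per decoder stage.
    Assumption: every stage has the same channel count C and fusion is a
    plain addition (both are illustrative choices, not taken from the paper).
    """
    # Global average pooling: one value per channel, shape (C, 1, 1).
    context = deep_feat.mean(axis=(1, 2), keepdims=True)

    guided = []
    for feat in decoder_feats:
        # Nearest-neighbour upsampling of a 1x1 map is the same as
        # broadcasting it to the spatial size of the target stage.
        guidance = np.broadcast_to(context, feat.shape)
        guided.append(feat + guidance)
    return guided


# Usage: a 4-channel deep feature guiding two decoder stages.
deep = np.full((4, 2, 2), 2.0)
stages = [np.zeros((4, 4, 4)), np.zeros((4, 8, 8))]
outputs = global_semantic_guidance(deep, stages)
```

Because the pooled vector is identical at every scale, each decoder stage receives the same high-level summary regardless of its resolution, which is one simple way to read "cross-scale semantic consistency".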

📝 Abstract
Infrared small target detection (IRSTD) aims to identify and distinguish small targets from complex backgrounds. Leveraging the powerful multi-scale feature fusion capability of the U-Net architecture, IRSTD has achieved significant progress. However, U-Net suffers from semantic degradation when transferring high-level features from deep to shallow layers, which limits the precise localization of small targets. To address this issue, this paper proposes FSGNet, a lightweight and effective detection framework incorporating frequency-aware and semantic guidance mechanisms. Specifically, a multi-directional interactive attention module is applied throughout the encoder to capture fine-grained and directional features, enhancing the network's sensitivity to small, low-contrast targets. To suppress background interference propagated through skip connections, a multi-scale frequency-aware module leverages the fast Fourier transform to filter out target-similar clutter while preserving salient target structures. At the deepest layer, a global pooling module captures high-level semantic information, which is subsequently upsampled and propagated to each decoder stage through global semantic guidance flows, ensuring semantic consistency and precise localization across scales. Extensive experiments on four public IRSTD datasets demonstrate that FSGNet achieves superior detection performance while maintaining high efficiency, highlighting its practical applicability and robustness. The code will be released at https://github.com/Wangtao-Bao/FSGNet.
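To make the frequency-domain idea concrete, the sketch below removes smooth background from a synthetic image with a fixed radial high-pass mask applied after a 2-D FFT. It is a minimal illustration under assumptions, not the paper's module: `highpass_filter`, `make_scene`, the `cutoff` parameter, and the hard-coded mask are hypothetical, whereas the abstract describes a multi-scale module inside a trained network whose filtering details are not given here.

```python
import numpy as np


def highpass_filter(image, cutoff=0.1):
    """Zero out spatial frequencies at or below `cutoff` (cycles per pixel).

    `cutoff` and the hard radial mask are illustrative assumptions.
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, shape (h, 1)
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, shape (1, w)
    radius = np.sqrt(fy ** 2 + fx ** 2)
    mask = radius > cutoff            # keep only high frequencies

    spectrum = np.fft.fft2(image)
    filtered = np.fft.ifft2(spectrum * mask)
    # The mask is symmetric, so the result is real up to rounding error.
    return filtered.real


def make_scene(size=64, target=(20, 40), strength=0.3):
    """Smooth sinusoidal background plus one dim single-pixel target."""
    _, x = np.mgrid[0:size, 0:size]
    scene = 0.5 + 0.5 * np.sin(2 * np.pi * x / size)
    scene[target] += strength
    return scene


scene = make_scene()
filtered = highpass_filter(scene)

# Brightest pixel before and after filtering.
raw_peak = tuple(int(i) for i in np.unravel_index(np.argmax(scene), scene.shape))
new_peak = tuple(int(i) for i in np.unravel_index(np.argmax(filtered), filtered.shape))
```

In this toy scene the brightest raw pixel belongs to the background, while after filtering the brightest pixel is the target, because the background occupies only the lowest frequencies and the point target spreads across the whole spectrum. Real infrared clutter is not this cleanly separable, which is presumably why the paper combines frequency filtering with attention and semantic guidance.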
Problem

Research questions and friction points this paper is trying to address.

Infrared small target detection
Semantic degradation
Precise localization
Background interference
Multi-scale feature fusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Frequency-aware
Semantic guidance
Multi-directional attention
Infrared small target detection
Feature fusion
Yingmei Zhang
School of Software and Internet of Things Engineering, Jiangxi University of Finance and Economics, Nanchang 330032, China; also with CGN Begood Technology co., Ltd, Nanchang 330032, China
Wangtao Bao
School of Software and Internet of Things Engineering, Jiangxi University of Finance and Economics, Nanchang 330032, China
Yong Yang
Professor of Computer Science, Tiangong University
Image processing, Information fusion, Machine learning, Pattern recognition, Deep learning
Weiguo Wan
Jiangxi University of Finance and Economics
Image Processing, Computer Vision, Pattern Recognition, Deep Learning
Qin Xiao
School of Information Management and Mathematics, Jiangxi University of Finance and Economics, Nanchang 330032, China
Xueting Zou
School of Software and Internet of Things Engineering, Jiangxi University of Finance and Economics, Nanchang 330032, China