🤖 AI Summary
Existing underwater image enhancement methods that leverage signal-to-noise ratio (SNR) priors to suppress wavelength-dependent attenuation struggle to disentangle cross-channel interference, and to balance structural enhancement against noise suppression. To address these challenges, this paper proposes a frequency-domain SNR-guided Transformer. We introduce the SNR prior into the frequency domain for the first time, designing an SNR-aware Fourier attention and a frequency-adaptive gating mechanism to achieve amplitude-phase decoupling, cross-channel interference separation, and joint optimization of low- and high-frequency components. A U-shaped dual-stream architecture improves feature-reuse efficiency. Trained on 4,800 image pairs, our method achieves a 3.1 dB PSNR gain and a 0.08 SSIM improvement over state-of-the-art approaches, markedly restoring color fidelity, texture detail, and global contrast. This work establishes an interpretable, frequency-domain-driven paradigm for underwater visual enhancement.
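The frequency-adaptive gating described above can be illustrated as a per-component convex blend of a low-frequency branch and a high-frequency branch. This is only a toy sketch with fixed gate scores; in the actual model the gate is learned and operates on Transformer features, and all variable names below are illustrative, not from the paper:

```python
import math

def sigmoid(z):
    """Squash a raw score into a gate value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Toy per-component features: a low-frequency branch (coarse color /
# illumination) and a high-frequency branch (edges / texture).
low_branch = [0.9, 0.7, 0.2, 0.1]
high_branch = [0.1, 0.3, 0.8, 0.9]

# A learned sub-network would produce these scores; fixed here for clarity.
gate_scores = [2.0, 0.5, -0.5, -2.0]
gate = [sigmoid(s) for s in gate_scores]

# Convex blend per component: the gate decides how much each branch
# contributes, so every fused value lies between the two branch values.
fused = [g * lo + (1.0 - g) * hi
         for g, lo, hi in zip(gate, low_branch, high_branch)]
```

Because the gate stays in (0, 1), the fusion can emphasize smooth low-frequency content where noise dominates and sharp high-frequency content where structure dominates, without ever discarding either branch entirely.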
📝 Abstract
Recent learning-based underwater image enhancement (UIE) methods have advanced by incorporating physical priors into deep neural networks, in particular the signal-to-noise ratio (SNR) prior, to reduce wavelength-dependent attenuation. However, spatial-domain SNR priors have two limitations: (i) they cannot effectively separate cross-channel interference, and (ii) they offer limited help in amplifying informative structures while suppressing noise. To overcome these limitations, we propose applying the SNR prior in the frequency domain, decomposing features into amplitude and phase spectra for finer channel modulation. We introduce the Fourier Attention SNR-prior Transformer (FAST), which combines spectral interactions with SNR cues to highlight key spectral components. In addition, a Frequency Adaptive Transformer (FAT) bottleneck merges low- and high-frequency branches through a gated attention mechanism to enhance perceptual quality. Embedded in a unified U-shaped architecture, these modules integrate a conventional RGB stream with an SNR-guided branch, forming SFormer. Trained on 4,800 paired images from UIEB, EUVP, and LSUI, SFormer surpasses recent methods by 3.1 dB in PSNR and 0.08 in SSIM, successfully restoring color, texture, and contrast in underwater scenes.
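The amplitude-phase decoupling underlying FAST can be sketched in one dimension: transform a signal to the frequency domain, split each bin into amplitude and phase, modulate the amplitude, and recombine. This is a minimal pure-Python sketch using a naive DFT; the paper operates on 2-D feature spectra with learned, SNR-guided attention, which this toy example omits, and the identity `weights` stand in for where such modulation would act:

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def idft(X):
    """Inverse DFT (scaled by 1/n)."""
    n = len(X)
    return [sum(X[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]

signal = [1.0, 2.0, 3.0, 4.0]
spectrum = dft(signal)

# Decouple each frequency bin into amplitude and phase.
amplitude = [abs(c) for c in spectrum]
phase = [cmath.phase(c) for c in spectrum]

# Modulate the amplitude only (an SNR-guided re-weighting would act here);
# identity weights keep the sketch verifiable, then recombine with the phase.
weights = [1.0] * len(amplitude)
recombined = [a * w * cmath.exp(1j * p)
              for a, w, p in zip(amplitude, weights, phase)]

# Inverse transform recovers the original signal up to float error.
recovered = [c.real for c in idft(recombined)]
```

Keeping the phase untouched while re-weighting the amplitude is what lets frequency-domain modulation adjust channel energy without corrupting structural alignment, since structure in an image is carried largely by the phase spectrum.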