🤖 AI Summary
Underwater salient object detection (USOD) suffers from severe image degradation and domain shift. Existing methods typically treat degradation as noise to be removed, overlooking the physical cues it carries. To address this, the authors propose WaterFlow, a physics-guided framework built on rectified flow that explicitly incorporates underwater imaging priors, such as wavelength-dependent attenuation and backscattering, into training so that feature evolution stays physically consistent. It also models the temporal dimension across flow steps to strengthen long-range saliency representation. This contrasts with conventional USOD methods, which either encode physical mechanisms only implicitly or apply simple denoising. On the USOD10K benchmark, the method achieves a 0.072 improvement in the structure measure (S_m) and surpasses state-of-the-art approaches, indicating that combining physical priors with temporal dynamics is effective for USOD.
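To make the two physical cues named above concrete, here is a minimal sketch of the simplified underwater image formation model that is widely used in the literature, I_c = J_c · t_c + B_c · (1 − t_c) with t_c = exp(−β_c · d). It is not taken from the paper: the coefficient values, the names `BETA`, `BACKLIGHT`, `transmission`, and `degrade`, and the choice of this particular model are all illustrative assumptions.

```python
import math

# Hypothetical per-channel attenuation coefficients (1/m).
# Red light is absorbed fastest in water, so it gets the largest value.
BETA = {"r": 0.60, "g": 0.12, "b": 0.08}

# Hypothetical veiling (backscattered) light per channel, in [0, 1].
BACKLIGHT = {"r": 0.05, "g": 0.45, "b": 0.55}


def transmission(channel, range_m):
    """Fraction of scene radiance that survives range_m meters of water."""
    return math.exp(-BETA[channel] * range_m)


def degrade(pixel, range_m):
    """Apply I_c = J_c * t_c + B_c * (1 - t_c) to one RGB pixel in [0, 1]."""
    out = {}
    for c, j in pixel.items():
        t = transmission(c, range_m)
        # First term: attenuated direct signal. Second term: backscatter.
        out[c] = j * t + BACKLIGHT[c] * (1.0 - t)
    return out


clean = {"r": 0.8, "g": 0.5, "b": 0.3}
near = degrade(clean, 1.0)   # mild color cast
far = degrade(clean, 10.0)   # red channel almost replaced by backscatter
```

Because the degradation depends on camera-to-object range, the color cast itself carries scene information, which is the kind of cue the summary says is lost when degradation is treated purely as noise.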
📝 Abstract
Underwater Salient Object Detection (USOD) faces significant challenges, including underwater image quality degradation and domain gaps. Existing methods tend to ignore the physical principles of underwater imaging, or simply treat the degradation in underwater images as interference that must be eliminated, and so fail to exploit the valuable information it contains. We propose WaterFlow, a rectified flow-based framework for underwater salient object detection that incorporates underwater physical imaging information as explicit priors directly into network training and introduces temporal dimension modeling, significantly enhancing the model's ability to identify salient objects. On the USOD10K dataset, WaterFlow achieves a 0.072 gain in S_m, demonstrating the effectiveness and superiority of our method. The code will be released upon acceptance.
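For readers unfamiliar with rectified flow, the sketch below shows the generic formulation only: a straight-line path x_t = (1 − t)·x_0 + t·x_1 between a source sample x_0 and a target x_1, a constant velocity target x_1 − x_0, and Euler integration at sampling time. How WaterFlow conditions on physical priors or models the temporal dimension is not described in the abstract and is not represented here; the function names and the oracle velocity that stands in for a trained network are assumptions for illustration.

```python
def interpolate(x0, x1, t):
    """Point on the straight path from x0 (t=0) to x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def velocity_target(x0, x1):
    """Regression target for the velocity model: constant along the path."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(x0, velocity_fn, steps=4):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy example: x0 plays the role of noise, x1 a flattened binary saliency mask.
x0 = [0.3, -1.2, 0.7, 0.1]
x1 = [1.0, 0.0, 0.0, 1.0]


def oracle(x, t):
    # Hypothetical stand-in for a trained network that predicts the velocity.
    return velocity_target(x0, x1)


mask = euler_sample(x0, oracle, steps=4)
```

With a perfect velocity predictor the path is exactly straight, so even a few Euler steps land on the target; a learned model only approximates this, and the step count then trades speed against accuracy.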