🤖 AI Summary
Single-image deraining is difficult because rain streaks are directional, concentrated in the frequency domain, and overlap across multiple scales. This work proposes SpectralDiff, a diffusion-based framework that keeps the standard diffusion formulation but introduces structured spectral perturbations to progressively suppress multi-directional rain components. It pairs this with a full-product U-Net that uses the convolution theorem to replace spatial convolutions with element-wise products in the frequency domain, improving computational efficiency while preserving modeling capacity. Experiments on synthetic and real-world rainy-image benchmarks show that SpectralDiff achieves competitive rain removal performance, with a more compact model and favorable inference efficiency compared with existing diffusion-based approaches.
📝 Abstract
Rain streaks manifest as directional and frequency-concentrated structures that overlap across multiple scales, making single-image rain removal particularly challenging. While diffusion-based restoration models provide a powerful framework for progressive denoising, standard spatial-domain diffusion does not explicitly account for such structured spectral characteristics. We introduce SpectralDiff, a spectral-structured diffusion-based framework tailored for single-image rain removal. Rather than redefining the diffusion formulation, our method incorporates structured spectral perturbations to guide the progressive suppression of multi-directional rain components. To support this design, we further propose a full-product U-Net architecture that leverages the convolution theorem to replace convolution operations with element-wise product layers, improving computational efficiency while preserving modeling capacity. Extensive experiments on synthetic and real-world benchmarks demonstrate that SpectralDiff achieves competitive rain removal performance with improved model compactness and favorable inference efficiency compared to existing diffusion-based approaches.
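The convolution theorem that the full-product U-Net relies on can be checked numerically. The sketch below is not the authors' architecture; it only illustrates the underlying identity, assuming circular (wrap-around) boundary conditions and a single-channel image. The function names `circular_conv2d` and `fft_conv2d` are hypothetical and chosen for illustration.

```python
import numpy as np


def circular_conv2d(image, kernel):
    """Direct 2-D circular convolution in the spatial domain."""
    out = np.zeros(image.shape, dtype=float)
    kh, kw = kernel.shape
    for i in range(kh):
        for j in range(kw):
            # np.roll shifts with wrap-around, so rolled[n, m] == image[(n - i) % H, (m - j) % W]
            out += kernel[i, j] * np.roll(image, shift=(i, j), axis=(0, 1))
    return out


def fft_conv2d(image, kernel):
    """Same operation as an element-wise product of spectra."""
    # Zero-pad the kernel to the image size before transforming it
    kernel_spectrum = np.fft.fft2(kernel, s=image.shape)
    image_spectrum = np.fft.fft2(image)
    # Element-wise product in the frequency domain, then back to the spatial domain
    return np.real(np.fft.ifft2(image_spectrum * kernel_spectrum))


rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))   # toy stand-in for a feature map
kernel = rng.standard_normal((3, 3))    # toy stand-in for a convolution filter

spatial = circular_conv2d(image, kernel)
spectral = fft_conv2d(image, kernel)
print(np.allclose(spatial, spectral))
```

For an H×W feature map, the element-wise product itself costs one multiplication per frequency bin regardless of kernel size, which is one plausible source of the efficiency gain the abstract describes. How SpectralDiff handles boundaries, channels, and nonlinearities is not stated here.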