🤖 AI Summary
To address the slow training and inference of diffusion models in single-image super-resolution (SISR), which hinder real-time deployment, this paper proposes a wavelet-domain conditional Diffusion GAN framework. It integrates the discrete wavelet transform (DWT) into the diffusion process, enabling conditional generation and multi-scale residual learning directly in the wavelet domain and thereby substantially reducing feature dimensionality and the number of reverse diffusion steps. By combining the high-fidelity synthesis of diffusion models with the efficient modeling of GANs, our method achieves new state-of-the-art performance on CelebA-HQ: it substantially improves PSNR and SSIM, accelerates training by 2.3×, and reduces per-image inference time to 18 ms for ×4 SISR. To our knowledge, this is the first approach to enable high-fidelity, real-time super-resolution.
📝 Abstract
In recent years, diffusion models have emerged as a superior alternative to generative adversarial networks (GANs) for high-fidelity image generation, with wide applications in text-to-image generation, image-to-image translation, and super-resolution. However, their real-time feasibility is hindered by slow training and inference. This study addresses that challenge by proposing a wavelet-based conditional Diffusion GAN scheme for single-image super-resolution (SISR). Our approach uses the Diffusion GAN paradigm to reduce the number of timesteps required by the reverse diffusion process, and the discrete wavelet transform (DWT) to reduce dimensionality, significantly decreasing both training and inference time. Experimental validation on the CelebA-HQ dataset confirms the effectiveness of the proposed scheme: our approach outperforms state-of-the-art methods, ensuring high-fidelity output while overcoming the drawbacks that limit diffusion models in time-sensitive applications.
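To make the dimensionality-reduction claim concrete, the sketch below shows how a single level of a 2D DWT splits an H×W image into four H/2×W/2 subbands, so the diffusion network can operate at a quarter of the spatial resolution. The paper's actual wavelet and implementation are not specified here; a Haar wavelet in plain NumPy is used purely for illustration.

```python
import numpy as np

def haar_dwt2(x):
    """One level of the orthonormal 2D Haar DWT.

    Splits an (H, W) array (H, W even) into four (H/2, W/2) subbands:
    LL (low-frequency approximation) and LH, HL, HH (detail bands).
    """
    s = np.sqrt(2.0)
    # Filter along columns: pairwise sums (low-pass) and differences (high-pass)
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Repeat the same split along rows of each filtered output
    ll = (lo[0::2, :] + lo[1::2, :]) / s
    lh = (lo[0::2, :] - lo[1::2, :]) / s
    hl = (hi[0::2, :] + hi[1::2, :]) / s
    hh = (hi[0::2, :] - hi[1::2, :]) / s
    return ll, lh, hl, hh

img = np.random.rand(256, 256)
subbands = haar_dwt2(img)
print([b.shape for b in subbands])  # four 128x128 subbands
```

Because the transform is orthonormal it preserves signal energy and is exactly invertible, so a network trained on the subbands loses no information while each forward/reverse diffusion step touches 4× fewer spatial positions per band.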