Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution

📅 2025-05-01
🤖 AI Summary
Image super-resolution at high scaling factors struggles to achieve perceptual realism and structural fidelity at the same time. To address this, we propose ResQu, a framework that integrates the quaternion wavelet transform into latent diffusion models through a quaternion wavelet- and time-aware encoder, which conditions the denoising process dynamically at multiple stages. This design avoids the static coupling between conventional wavelet representations and diffusion processes, and it transfers the generative priors of foundation models such as Stable Diffusion to super-resolution tasks. Evaluated on domain-specific benchmarks, ResQu improves PSNR, SSIM, and perceptual quality metrics, outperforming existing methods in many cases. It targets high-magnification super-resolution that preserves both texture authenticity and geometric consistency.

📝 Abstract
Image Super-Resolution (SR) is a fundamental problem in computer vision with broad applications ranging from medical imaging to satellite analysis. The ability to reconstruct high-resolution images from low-resolution inputs is crucial for enhancing downstream tasks such as object detection and segmentation. While deep learning has significantly advanced SR, achieving high-quality reconstructions with fine-grained details and realistic textures remains challenging, particularly at high upscaling factors. Recent approaches leveraging diffusion models have demonstrated promising results, yet they often struggle to balance perceptual quality with structural fidelity. In this work, we introduce ResQu, a novel SR framework that integrates a quaternion wavelet preprocessing framework with latent diffusion models, incorporating a new quaternion wavelet- and time-aware encoder. Unlike prior methods that simply apply wavelet transforms within diffusion models, our approach enhances the conditioning process by exploiting quaternion wavelet embeddings, which are dynamically integrated at different stages of denoising. We also leverage the generative priors of foundation models such as Stable Diffusion. Extensive experiments on domain-specific datasets demonstrate that our method achieves outstanding SR results, in many cases outperforming existing approaches in perceptual quality and standard evaluation metrics. The code will be available after the revision process.
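The abstract does not say how the quaternion wavelet embeddings are injected at different denoising stages. As a rough illustration only, one common pattern is to scale a conditioning feature map by a gate computed from the diffusion timestep. The sketch below assumes that pattern; the function names, the sigmoid gate, and the weights `w_gate` and `b_gate` are hypothetical and are not taken from ResQu.

```python
import numpy as np

def timestep_embedding(t, dim):
    """Sinusoidal embedding of a scalar diffusion timestep (dim must be even)."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = t * freqs
    return np.concatenate([np.sin(args), np.cos(args)])

def time_gated_condition(wavelet_feat, t, w_gate, b_gate):
    """Scale each channel of a (C, H, W) conditioning map by a timestep-dependent gate.

    w_gate has shape (C, dim) and b_gate has shape (C,); both are hypothetical
    learned parameters, filled with random values in the usage below.
    """
    emb = timestep_embedding(t, w_gate.shape[1])
    gate = 1.0 / (1.0 + np.exp(-(w_gate @ emb + b_gate)))  # shape (C,), values in (0, 1)
    return wavelet_feat * gate[:, None, None]

# Usage with random stand-ins for the wavelet features and the gate parameters.
rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 8, 8))   # hypothetical wavelet embedding
w_gate = rng.standard_normal((16, 32))
b_gate = np.zeros(16)
early = time_gated_condition(feat, t=900, w_gate=w_gate, b_gate=b_gate)
late = time_gated_condition(feat, t=50, w_gate=w_gate, b_gate=b_gate)
```

The same feature map is weighted differently at `t=900` and `t=50`, which is the sense in which such conditioning would be time-aware.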
Problem

Research questions and friction points this paper is trying to address.

Enhancing image super-resolution quality and detail preservation
Balancing perceptual quality with structural fidelity in SR
Integrating quaternion wavelets with diffusion models for SR
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quaternion wavelet preprocessing with diffusion models
Dynamic quaternion wavelet embeddings in denoising
Leveraging Stable Diffusion generative priors
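
To give a feel for what a quaternion wavelet representation contains, the sketch below builds a simplified stand-in: the image plus its Hilbert transforms along the rows, the columns, and both (four quaternion-like components), each decomposed by a one-level Haar transform. This is an assumption-laden approximation for illustration; a full quaternion wavelet transform is typically built from dual-tree filter banks, and the paper's exact preprocessing is not described in this summary. All function names are hypothetical.

```python
import numpy as np

def hilbert_1d(x, axis):
    """Hilbert transform along one axis via the FFT (multiplier -i * sign(freq))."""
    n = x.shape[axis]
    mult = -1j * np.sign(np.fft.fftfreq(n))
    shape = [1] * x.ndim
    shape[axis] = n
    spectrum = np.fft.fft(x, axis=axis)
    return np.real(np.fft.ifft(spectrum * mult.reshape(shape), axis=axis))

def haar_dwt2(x):
    """One-level orthonormal 2D Haar transform; height and width must be even.

    Returns four half-size subbands: the low-pass approximation first,
    followed by three detail subbands.
    """
    s = np.sqrt(2.0)
    lo = (x[0::2, :] + x[1::2, :]) / s   # low-pass along the vertical axis
    hi = (x[0::2, :] - x[1::2, :]) / s   # high-pass along the vertical axis
    ll = (lo[:, 0::2] + lo[:, 1::2]) / s
    lh = (lo[:, 0::2] - lo[:, 1::2]) / s
    hl = (hi[:, 0::2] + hi[:, 1::2]) / s
    hh = (hi[:, 0::2] - hi[:, 1::2]) / s
    return ll, lh, hl, hh

def quaternion_wavelet_features(img):
    """Stack Haar subbands of the four quaternion-like components.

    Output shape is (4 components, 4 subbands, H/2, W/2).
    """
    hx = hilbert_1d(img, axis=1)    # Hilbert transform along rows
    hy = hilbert_1d(img, axis=0)    # Hilbert transform along columns
    hxy = hilbert_1d(hx, axis=0)    # Hilbert transform along both axes
    return np.stack([np.stack(haar_dwt2(c)) for c in (img, hx, hy, hxy)])

# Usage on a random grayscale image.
image = np.random.default_rng(0).standard_normal((16, 16))
features = quaternion_wavelet_features(image)
```

In a conditioning pipeline, a tensor like `features` would be the input to an encoder; how ResQu encodes and injects it is left to the paper.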