GuideSR: Rethinking Guidance for One-Step High-Fidelity Diffusion-Based Super-Resolution

📅 2025-05-01
🤖 AI Summary
Existing diffusion-based super-resolution methods rely on VAE-downsampled features as conditioning inputs, which degrades structural fidelity. To address this, we propose GuideSR, a single-step diffusion super-resolution model built on a dual-branch collaborative architecture. The Guidance Branch preserves full-resolution structural priors by feeding the degraded input image directly through an Image Guidance Network (IGN), while the Diffusion Branch leverages a pre-trained Latent Diffusion Model (LDM) to enhance texture details without coarse-grained conditional injection. GuideSR further introduces guided attention and channel-wise attention mechanisms to jointly optimize structural accuracy and perceptual quality. Evaluated on real-world datasets, GuideSR achieves state-of-the-art performance in a single diffusion step, with a PSNR gain of up to 1.39 dB and better SSIM, LPIPS, DISTS, and FID scores, demonstrating both high-fidelity reconstruction and efficient inference.

📝 Abstract
In this paper, we propose GuideSR, a novel single-step diffusion-based image super-resolution (SR) model specifically designed to enhance image fidelity. Existing diffusion-based SR approaches typically adapt pre-trained generative models to image restoration tasks by adding extra conditioning on a VAE-downsampled representation of the degraded input, which often compromises structural fidelity. GuideSR addresses this limitation by introducing a dual-branch architecture comprising: (1) a Guidance Branch that preserves high-fidelity structures from the original-resolution degraded input, and (2) a Diffusion Branch that leverages a pre-trained latent diffusion model to enhance perceptual quality. Unlike conventional conditioning mechanisms, our Guidance Branch features a structure tailored to image restoration tasks, combining Full Resolution Blocks (FRBs) with channel attention and an Image Guidance Network (IGN) with guided attention. By embedding detailed structural information directly into the restoration pipeline, GuideSR produces sharper and more visually consistent results. Extensive experiments on benchmark datasets demonstrate that GuideSR achieves state-of-the-art performance while maintaining the low computational cost of single-step approaches, with up to 1.39 dB PSNR gain on challenging real-world datasets. Our approach consistently outperforms existing methods across various reference-based metrics, including PSNR, SSIM, LPIPS, DISTS, and FID, representing a practical advancement for real-world image restoration.
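
To put the reported PSNR gain in perspective, the short sketch below uses the standard PSNR definition, 10 * log10(MAX^2 / MSE), to convert a gain in dB into the equivalent reduction in mean squared error. The function names and the baseline MSE of 100 are hypothetical values chosen for illustration; they do not come from the paper, and the 1.39 dB figure is the best case the abstract reports, not a typical gain.

```python
import math


def psnr(mse: float, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse <= 0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which MSE must shrink for PSNR to rise by gain_db."""
    return 10.0 ** (gain_db / 10.0)


# Hypothetical baseline error, chosen only for illustration.
baseline_mse = 100.0
ratio = mse_ratio_for_gain(1.39)   # roughly 1.377
improved_mse = baseline_mse / ratio

print(f"MSE ratio for +1.39 dB: {ratio:.3f}")
print(f"Baseline PSNR: {psnr(baseline_mse):.2f} dB")
print(f"Improved PSNR: {psnr(improved_mse):.2f} dB")
```

Under these assumptions, a 1.39 dB gain corresponds to roughly 27% lower mean squared error, regardless of the baseline chosen.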
Problem

Research questions and friction points this paper is trying to address.

Enhancing image fidelity in single-step super-resolution
Addressing structural fidelity loss in diffusion-based SR
Balancing perceptual quality and computational efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-branch architecture enhances image fidelity
Full Resolution Blocks with channel attention
Image Guidance Network with guided attention
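
The summary does not specify how the channel attention inside the Full Resolution Blocks is built. As a rough illustration of the general idea only, the sketch below implements a generic squeeze-and-excitation style channel gate in plain Python: each channel is averaged to one number, a small linear layer plus sigmoid turns those numbers into per-channel gates, and the gates rescale the channels. The function name, the single linear layer, and the toy weights are all assumptions, not GuideSR's actual design.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features, weights, biases):
    """Generic channel gate (assumed form, not the paper's exact block).

    features: list of C channels, each an H x W list of lists of floats.
    weights:  C x C matrix of the gating layer; biases: length-C list.
    Returns (rescaled features, per-channel gates).
    """
    # Squeeze: global average of every channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Excite: linear layer followed by a sigmoid, one gate in (0, 1) per channel.
    gates = [
        sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(weights, biases)
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(features, gates)
    ]
    return scaled, gates


# Toy input: two 2x2 channels with hypothetical gating weights.
toy_features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
toy_weights = [[1.0, 0.0], [0.0, -1.0]]
toy_biases = [0.0, 0.0]
toy_scaled, toy_gates = channel_attention(toy_features, toy_weights, toy_biases)
```

With these toy weights the first channel is kept largely intact while the second is strongly suppressed, which is the kind of per-channel reweighting that channel attention provides.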