Controllable Reference-Based Real-World Remote Sensing Image Super-Resolution with Generative Diffusion Priors

📅 2025-06-30

🤖 AI Summary
Existing reference-based super-resolution (RefSR) methods suffer from insufficient detail recovery (under-generation) and over-reliance on reference images in real-world remote sensing scenarios, primarily because of cross-sensor resolution gaps and land cover changes between acquisition dates. To address these challenges, this paper proposes CRefDiff, a controllable reference-based diffusion model built on a pretrained Stable Diffusion model. It introduces a dual-branch fusion mechanism that adaptively integrates local and global information from the reference image, supports reference-strength control at inference time, and uses a strategy named Better Start to reduce the number of denoising steps. On the newly constructed real-world dataset Real-RefRSSRD (HR NAIP and LR Sentinel-2 image pairs), CRefDiff achieves state-of-the-art performance across various metrics and improves downstream tasks such as scene classification and semantic segmentation.

📝 Abstract
Super-resolution (SR) techniques enhance the spatial resolution of remote sensing images by reconstructing high-resolution (HR) images from low-resolution (LR) ones, enabling more efficient large-scale earth observation applications. While single-image super-resolution (SISR) methods have shown progress, reference-based super-resolution (RefSR) offers superior performance by incorporating historical HR images alongside current LR observations. However, existing RefSR methods struggle with real-world complexities, such as the cross-sensor resolution gap and significant land cover changes, which often lead to under-generation or over-reliance on the reference image. To address these challenges, we propose CRefDiff, a novel controllable reference-based diffusion model for real-world remote sensing image SR. To counter under-generation, CRefDiff is built upon the pretrained Stable Diffusion model, leveraging its powerful generative prior to produce accurate structures and textures. To mitigate over-reliance on the reference, we introduce a dual-branch fusion mechanism that adaptively integrates both local and global information from the reference image. Moreover, this dual-branch design enables reference strength control during inference, enhancing the interactivity and flexibility of the model. Finally, a strategy named Better Start is proposed to significantly reduce the number of denoising steps, thereby accelerating inference. To support further research, we introduce Real-RefRSSRD, a new real-world RefSR dataset for remote sensing images, consisting of HR NAIP and LR Sentinel-2 image pairs with diverse land cover changes and significant temporal gaps. Extensive experiments on Real-RefRSSRD show that CRefDiff achieves state-of-the-art performance across various metrics and improves downstream tasks such as scene classification and semantic segmentation.
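
The reference strength control mentioned in the abstract can be pictured as a scalar knob that scales how much reference-derived information is mixed into the LR features. The snippet below is a minimal sketch under stated assumptions, not the authors' implementation: the function name `fuse_with_reference`, the additive residual form, and the fixed branch weights `w_local` and `w_global` are all hypothetical, whereas the fusion in the paper is described as adaptive.

```python
from typing import List


def fuse_with_reference(
    lr_feat: List[float],
    local_ref: List[float],
    global_ref: List[float],
    strength: float = 1.0,
    w_local: float = 0.5,
    w_global: float = 0.5,
) -> List[float]:
    """Hypothetical dual-branch fusion with an inference-time strength knob.

    Assumption: reference information enters as an additive residual.
    strength = 0.0 ignores the reference; larger values lean on it more.
    """
    if not (len(lr_feat) == len(local_ref) == len(global_ref)):
        raise ValueError("all feature vectors must have the same length")
    if strength < 0.0:
        raise ValueError("strength must be non-negative")

    fused = []
    for x, loc, glo in zip(lr_feat, local_ref, global_ref):
        # Combine the local (texture) and global (structure) branch outputs.
        ref_term = w_local * loc + w_global * glo
        # Scale the combined reference contribution by the user-chosen strength.
        fused.append(x + strength * ref_term)
    return fused


# Toy feature vectors standing in for real network activations.
lr = [1.0, 2.0, 3.0]
local_branch = [0.2, 0.0, -0.2]
global_branch = [0.4, 0.4, 0.4]

no_ref = fuse_with_reference(lr, local_branch, global_branch, strength=0.0)
full_ref = fuse_with_reference(lr, local_branch, global_branch, strength=1.0)
```

With `strength=0.0` the output equals the LR features, which mirrors the situation where the reference is outdated because the land cover has changed; raising the value pulls more detail from the reference.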
Problem

Research questions and friction points this paper is trying to address.

Enhance remote sensing image resolution using reference-based super-resolution
Address under-generation and over-reliance on reference images in RefSR
Reduce denoising steps for faster inference in diffusion models
Innovation

Methods, ideas, or system contributions that make the work stand out.

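Leverages the generative prior of a pretrained Stable Diffusion model
Introduces a dual-branch fusion mechanism that enables reference strength control
Proposes the Better Start strategy to reduce denoising steps and accelerate inference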
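
The summary does not spell out how Better Start works. One common way to shorten diffusion sampling is to begin the reverse process from a noised coarse estimate at an intermediate timestep instead of from pure noise at the final timestep. The sketch below illustrates that general idea only; whether CRefDiff does exactly this is an assumption, and the names `linear_alpha_bars` and `noised_start`, the linear schedule, and the chosen `t_start` are hypothetical.

```python
import math
import random
from typing import List


def linear_alpha_bars(
    num_steps: int, beta_start: float = 1e-4, beta_end: float = 2e-2
) -> List[float]:
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noised_start(
    x0_estimate: List[float],
    t_start: int,
    alpha_bars: List[float],
    rng: random.Random,
) -> List[float]:
    """Sample x_t ~ q(x_t | x_0) at an intermediate step t_start.

    Uses the standard forward-diffusion closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
    """
    a = alpha_bars[t_start]
    return [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for x in x0_estimate
    ]


alpha_bars = linear_alpha_bars(1000)
coarse = [0.1, -0.3, 0.5, 0.7]  # stand-in for a coarse SR estimate
rng = random.Random(0)

t_start = 399  # hypothetical intermediate step (0-indexed)
x_t = noised_start(coarse, t_start, alpha_bars, rng)

steps_from_noise = len(alpha_bars)  # reverse steps when starting from pure noise
steps_from_start = t_start + 1      # reverse steps when starting at t_start
```

In this toy setup the sampler would only need to run the reverse steps from `t_start` down to 0, so a lower `t_start` means fewer denoising steps at the cost of trusting the coarse estimate more.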