🤖 AI Summary
To address three key challenges in hyperspectral image (HSI) super-resolution (excessive memory consumption, weak modeling of the geometric structure of ground objects, and unstable convergence of the diffusion process), this paper proposes a Geometric Enhanced Wavelet-based Diffusion Model (GEWDiff). Methodologically: (i) a wavelet-based encoder-decoder compresses HSIs into a low-dimensional latent space to reduce the memory overhead of high-dimensional spectral data; (ii) a geometry-enhanced diffusion process explicitly models the topological and spatial structural relationships among ground objects; (iii) a multi-level supervised loss function jointly optimizes spectral-spatial consistency in the latent space and promotes stable convergence. Evaluated on multiple public benchmarks for 4× super-resolution reconstruction, GEWDiff achieves state-of-the-art performance in reconstruction fidelity, spectral accuracy, visual realism, and fine-detail preservation.
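To make point (i) concrete, the following is a minimal sketch of a single-level 2D Haar wavelet transform applied band by band to an HSI cube. It is not the paper's encoder: the function names `haar_dwt2` and `haar_idwt2`, the choice of the Haar wavelet, and the `(bands, H, W)` layout are all assumptions made for illustration. It only shows the basic property a wavelet front end relies on: each subband has one quarter of the spatial size, and the transform is exactly invertible.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar transform of a (bands, H, W) cube.

    Hypothetical helper for illustration; H and W must be even.
    Returns four subbands, each of shape (bands, H/2, W/2).
    """
    a = x[:, 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0      # low-pass approximation
    d_cols = (a - b + c - d) / 2.0  # detail across columns
    d_rows = (a + b - c - d) / 2.0  # detail across rows
    d_diag = (a - b - c + d) / 2.0  # diagonal detail
    return ll, d_cols, d_rows, d_diag


def haar_idwt2(ll, d_cols, d_rows, d_diag):
    """Exact inverse of haar_dwt2."""
    bands, h, w = ll.shape
    x = np.empty((bands, 2 * h, 2 * w), dtype=ll.dtype)
    x[:, 0::2, 0::2] = (ll + d_cols + d_rows + d_diag) / 2.0
    x[:, 0::2, 1::2] = (ll - d_cols + d_rows - d_diag) / 2.0
    x[:, 1::2, 0::2] = (ll + d_cols - d_rows - d_diag) / 2.0
    x[:, 1::2, 1::2] = (ll - d_cols - d_rows + d_diag) / 2.0
    return x


# Usage on a synthetic cube (random data, not a real HSI).
cube = np.random.default_rng(0).standard_normal((8, 16, 16))
subbands = haar_dwt2(cube)
restored = haar_idwt2(*subbands)
print(subbands[0].shape, np.allclose(restored, cube))
```

The transform alone does not shrink the data, since the four subbands together hold as many values as the input. Any memory saving comes from how an encoder then treats the subbands (for example, how many channels it keeps), which this summary does not specify.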
📝 Abstract
Improving the quality of hyperspectral images (HSIs), for example through super-resolution, is a crucial research area. However, generative modeling for HSIs presents several challenges. Due to their high spectral dimensionality, HSIs are too memory-intensive to be fed directly into conventional diffusion models. Furthermore, general generative models lack an understanding of the topological and geometric structures of ground objects in remote sensing imagery. In addition, most diffusion models optimize loss functions at the noise level, leading to non-intuitive convergence behavior and suboptimal generation quality for complex data. To address these challenges, we propose a Geometric Enhanced Wavelet-based Diffusion Model (GEWDiff), a novel framework for reconstructing hyperspectral images at 4× super-resolution. A wavelet-based encoder-decoder efficiently compresses HSIs into a latent space while preserving spectral-spatial information. To avoid distortion during generation, we incorporate a geometry-enhanced diffusion process that preserves geometric features. Furthermore, a multi-level loss function is designed to guide the diffusion process, promoting stable convergence and improved reconstruction fidelity. Our model demonstrates state-of-the-art results across multiple dimensions, including fidelity, spectral accuracy, visual realism, and clarity.
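For context, the "noise-level" objective mentioned in the abstract usually refers to the standard noise-prediction loss of DDPM-style diffusion models. The formulation below uses the conventional notation ($x_0$ for the clean sample, $\bar\alpha_t$ for the cumulative noise schedule, $\epsilon_\theta$ for the network), which is an assumption here and not taken from the paper; conditioning on the low-resolution input is omitted for brevity.

```latex
% Forward process: noisy sample at step t
x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1 - \bar\alpha_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% Noise-level loss: the target is the injected noise, not the image
\mathcal{L}_{\text{noise}}
  = \mathbb{E}_{x_0,\, \epsilon,\, t}
    \left\| \epsilon - \epsilon_\theta(x_t, t) \right\|_2^2

% Clean-sample estimate implied by the predicted noise
\hat{x}_0
  = \frac{x_t - \sqrt{1 - \bar\alpha_t}\, \epsilon_\theta(x_t, t)}
         {\sqrt{\bar\alpha_t}}
```

Because $\mathcal{L}_{\text{noise}}$ compares noise tensors, its value says little directly about how close the reconstruction is to the target. A loss computed on $\hat{x}_0$ is one common way to supervise at the data level instead; whether GEWDiff's multi-level loss takes this form is not stated in the abstract.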