🤖 AI Summary
Conventional spatially-invariant super-resolution (SR) models fail under real-world imaging degradations that vary with scene depth—such as atmospheric scattering and defocus blur. To address this, we propose the first theory-driven, depth-aware SR framework. Methodologically, we formulate a distance-adaptive variational model incorporating depth-conditioned convolutions and distance-dependent pseudo-differential operators to jointly model degradation spectra and enable dynamic regularization of the energy functional. We further design a cascaded residual gradient flow architecture integrating atmospheric scattering spectral constraints, a learnable depth-mapping filter, and a depth-guided adaptive kernel generation network. Evaluated on five benchmarks including KITTI, our method achieves 36.89 dB PSNR / 0.9516 SSIM for 2× SR and 30.54 dB PSNR / 0.8721 SSIM for 4× SR—surpassing state-of-the-art methods by 0.44 dB and 0.36 dB, respectively.
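The core premise — that degradation varies with scene depth — can be illustrated with a minimal NumPy sketch. This is not the paper's model: it simply blurs each pixel with a Gaussian whose width is interpolated between assumed near/far values, standing in for the distance-dependent degradation operator; the function name, interpolation formula, and parameter values are all illustrative assumptions.

```python
import numpy as np

def depth_adaptive_blur(image, depth, sigma_near=0.5, sigma_far=2.0, radius=3):
    """Blur each pixel with a Gaussian whose width grows with scene depth.

    Toy stand-in for a distance-dependent degradation operator: sigma is
    linearly interpolated between sigma_near (depth = 0) and sigma_far
    (depth = 1), so far-field pixels lose more high-frequency content.
    """
    h, w = image.shape
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    padded = np.pad(image.astype(float), radius, mode="reflect")
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            sigma = sigma_near + (sigma_far - sigma_near) * depth[i, j]
            kernel = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma ** 2))
            kernel /= kernel.sum()  # normalize so output is a convex combination
            patch = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            out[i, j] = float((patch * kernel).sum())
    return out
```

A spatially-invariant SR model implicitly inverts one fixed kernel; under this kind of forward model, the optimal reconstruction filter differs at every depth, which is what motivates the depth-conditioned convolutions above.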
📝 Abstract
Single image super-resolution traditionally assumes spatially-invariant degradation models, yet real-world imaging systems exhibit complex distance-dependent effects including atmospheric scattering, depth-of-field variations, and perspective distortions. This fundamental limitation necessitates spatially-adaptive reconstruction strategies that explicitly incorporate geometric scene understanding. We propose a rigorous variational framework that characterizes super-resolution as a spatially-varying inverse problem, formulating the degradation operator as a pseudo-differential operator with distance-dependent spectral characteristics that enables theoretical analysis of reconstruction limits across depth ranges. Our neural architecture implements discrete gradient flow dynamics through cascaded residual blocks with depth-conditional convolution kernels, ensuring convergence to stationary points of the theoretical energy functional, while learned distance-adaptive regularization terms dynamically adjust smoothness constraints based on local geometric structure. Spectral constraints derived from atmospheric scattering theory prevent bandwidth violations and noise amplification in far-field regions, while adaptive kernel generation networks learn continuous mappings from depth to reconstruction filters. Comprehensive evaluation across five benchmark datasets demonstrates state-of-the-art performance, achieving 36.89/0.9516 and 30.54/0.8721 PSNR/SSIM at 2× and 4× scales on KITTI outdoor scenes, outperforming existing methods by 0.44 dB and 0.36 dB respectively. This work establishes the first theoretically grounded distance-adaptive super-resolution framework and demonstrates significant improvements on depth-variant scenarios while maintaining competitive performance across traditional benchmarks.
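The "discrete gradient flow with distance-adaptive regularization" can be sketched in one dimension. This toy NumPy version is an assumption-laden illustration, not the paper's architecture: it descends a quadratic energy whose smoothness weight grows with depth (the depth-to-weight map is invented here), and each iteration is a residual update of the form the abstract attributes to its cascaded residual blocks.

```python
import numpy as np

def depth_weighted_gradient_flow(y, depth, steps=200, tau=0.1):
    """Discrete gradient flow on a depth-regularized quadratic energy.

    E(u) = 0.5 * ||u - y||^2 + 0.5 * sum_i lam_i * (u[i+1] - u[i])^2,
    with lam increasing in depth so far-field regions are smoothed more.
    Each iteration is a residual update u <- u - tau * dE/du.
    """
    lam = 0.1 + 0.9 * np.asarray(depth, dtype=float)  # assumed depth-to-weight map
    u = np.asarray(y, dtype=float).copy()
    for _ in range(steps):
        du = u[1:] - u[:-1]   # forward differences D u
        v = lam[:-1] * du     # depth-dependent edge weights
        g = u - y             # data-fidelity gradient
        g[:-1] -= v           # D^T v contribution: -v_j at index j
        g[1:] += v            #                     +v_{j-1} at index j
        u = u - tau * g
    return u
```

Because the energy is quadratic and the step size is small relative to the Hessian's largest eigenvalue, each update strictly decreases E, which is the discrete analogue of the convergence-to-stationary-points claim.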