🤖 AI Summary
To address overfitting and poor generalization of large models in few-shot super-resolution (SR) of unmanned aerial vehicle (UAV) infrared images, this paper proposes a Gaussian quantization representation learning framework coupled with a dynamic training monitoring mechanism, which mitigates overfitting and improves model robustness. The method builds on a diffusion-based architecture. The paper also introduces the first multi-source UAV infrared image few-shot SR benchmark dataset and systematically analyzes the overfitting mechanisms of large models under limited-data regimes. Experimental results on the curated dataset show that the proposed method outperforms state-of-the-art SR approaches, achieving an average PSNR gain of 1.82 dB. Reconstructed images exhibit richer structural details and more accurate thermal source localization, particularly under challenging conditions such as complex backgrounds and low signal-to-noise ratios. This work establishes a new paradigm for infrared visual enhancement in resource-constrained operational scenarios.
📝 Abstract
Although large-scale models achieve significant performance improvements, overfitting still frequently undermines their generalization ability. In image super-resolution tasks, diffusion models, as representative generative models, typically adopt large-scale architectures. However, few-shot drone-captured infrared training data frequently induces severe overfitting in such architectures. To address this key challenge, we propose a new Gaussian quantization representation learning method for diffusion models that alleviates overfitting and enhances robustness. In addition, an effective monitoring mechanism tracks large-scale architectures during training to detect signs of overfitting. By introducing Gaussian quantization representation learning, our method reduces overfitting while maintaining architectural complexity. On this basis, we construct a multi-source drone-based infrared image benchmark dataset for detection and use it to highlight the overfitting issues of large-scale architectures in few-sample, diverse drone-based image reconstruction scenarios. To verify the efficacy of the method in mitigating overfitting, experiments are conducted on the constructed benchmark. Experimental results demonstrate that our method outperforms existing super-resolution approaches and significantly mitigates overfitting of large-scale architectures under complex conditions. The code and DroneSR dataset will be available at: https://github.com/wengzp1/GARLSR.