🤖 AI Summary
To address the lack of paired ground-truth data for single-image glass reflection removal, this paper proposes the first diffusion-based framework incorporating self-supervision. Built upon denoising diffusion probabilistic models (DDPMs), the method employs a dual-branch latent-space encoder to disentangle reflection and transmission components. It further introduces a contrastive self-supervised loss and physically inspired reflection priors to eliminate reliance on manual annotations and to mitigate the artifacts introduced by synthetic training data. Evaluated on real-world benchmarks, including Real20 and UG2+, the approach achieves state-of-the-art performance, improving PSNR by over 2.1 dB. It also demonstrates superior visual fidelity and structural consistency compared to existing supervised and unsupervised methods, and generalizes well to diverse real-world scenarios.
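
For readers unfamiliar with the reported metric, the following is a minimal sketch of how PSNR is conventionally computed between a ground-truth transmission image and a predicted one. It uses the standard textbook definition; the summary does not describe the paper's exact evaluation protocol, so the function name `psnr`, the flat pixel-sequence input, and the default `max_value` of 255 are assumptions made for illustration.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio (dB) between two equally sized pixel sequences.

    Hypothetical helper: inputs are flattened pixel intensities, and
    max_value is the assumed peak intensity (255 for 8-bit images).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 2.1 dB at a fixed peak value corresponds to dividing the MSE by about 10^0.21 ≈ 1.62, that is, roughly 38% lower squared error.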