Unpaired Image-to-Image Translation via a Self-Supervised Semantic Bridge

📅 2026-02-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing unpaired image translation methods each have a weakness: adversarially trained models generalize poorly to unseen data, while diffusion-inversion approaches struggle to preserve structural fidelity. To address both, this work proposes the Self-Supervised Semantic Bridge (SSB) framework, which uses a self-supervised visual encoder to construct a shared latent space that is invariant to appearance variations yet retains geometric structure. This latent representation serves as a semantic prior for a diffusion bridge model, enabling spatially consistent, high-fidelity image translation without cross-domain supervision. Experiments show that SSB outperforms strong prior methods on medical image synthesis in both in-domain and out-of-domain settings, and that it extends to high-quality text-guided editing.

📝 Abstract

Adversarial diffusion and diffusion-inversion methods have advanced unpaired image-to-image translation, but each faces key limitations. Adversarial approaches require target-domain adversarial loss during training, which can limit generalization to unseen data, while diffusion-inversion methods often produce low-fidelity translations due to imperfect inversion into noise-latent representations. In this work, we propose the Self-Supervised Semantic Bridge (SSB), a versatile framework that integrates external semantic priors into diffusion bridge models to enable spatially faithful translation without cross-domain supervision. Our key idea is to leverage self-supervised visual encoders to learn representations that are invariant to appearance changes but capture geometric structure, forming a shared latent space that conditions the diffusion bridges. Extensive experiments show that SSB outperforms strong prior methods for challenging medical image synthesis in both in-domain and out-of-domain settings, and extends easily to high-quality text-guided editing.
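The two ingredients named in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: `toy_encoder` is a hypothetical stand-in for a self-supervised encoder (it only removes a global gain and bias, a crude proxy for "appearance"), and `brownian_bridge_sample` shows the standard Brownian-bridge marginal between two pinned endpoints. How SSB actually defines its bridge and injects the latent as conditioning is not specified on this page, so every name below is an assumption.

```python
import math
import random


def toy_encoder(image):
    """Hypothetical stand-in for a self-supervised encoder (assumption).

    Z-scores the pixels, so a global gain/bias change is removed
    while the spatial layout of the image is kept.
    """
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((p - mean) ** 2 for p in flat) / len(flat)) or 1.0
    return [[(p - mean) / std for p in row] for row in image]


def brownian_bridge_sample(x0, x1, t, sigma=1.0, rng=random):
    """Sample x_t from a Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    Marginal: x_t = (1 - t) * x0 + t * x1 + sigma * sqrt(t * (1 - t)) * eps.
    """
    scale = sigma * math.sqrt(t * (1.0 - t))
    return [
        [(1.0 - t) * a + t * b + scale * rng.gauss(0.0, 1.0)
         for a, b in zip(row0, row1)]
        for row0, row1 in zip(x0, x1)
    ]


# Same structure, different "appearance" (positive gain plus bias).
source = [[0.0, 1.0], [2.0, 3.0]]
restyled = [[2.0 * p + 5.0 for p in row] for row in source]

z_source = toy_encoder(source)
z_restyled = toy_encoder(restyled)

# Intermediate bridge state halfway between the two images.
x_half = brownian_bridge_sample(source, restyled, t=0.5, sigma=0.1)
```

In this toy setting both images map to the same latent, which is the property the abstract asks of the shared space; a real encoder would need to be invariant to far richer appearance changes than a gain and bias.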

Problem

Research questions and friction points this paper is trying to address.

unpaired image-to-image translation

adversarial diffusion

diffusion-inversion

low-fidelity translation

generalization limitation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Supervised Semantic Bridge

unpaired image-to-image translation

diffusion bridge