Training-Free Identity Preservation in Stylized Image Generation Using Diffusion Models

📅 2025-06-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing diffusion-based style transfer methods struggle to preserve identity consistency for small or distant faces. This paper proposes a training-free style transfer framework that addresses this limitation. The method introduces two key innovations: (1) a novel "mosaic-based content image restoration" technique to enhance fine-grained content stability; and (2) a training-free content consistency loss that integrates spatial masking guidance, content-attention reweighting, and original-image information anchoring across denoising steps. Without compromising style fidelity (matching state-of-the-art performance), the approach achieves substantial gains in identity preservation: identity similarity improves by 32.7% for small or distant faces. Moreover, the framework generalizes well, requiring no fine-tuning or additional training.
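To make the summary's loss description concrete, here is a minimal, hypothetical sketch of a content-consistency loss combining the three ingredients named above (spatial masking guidance, content-attention reweighting, and anchoring to original-image features). All function and parameter names are illustrative assumptions; this is not the paper's implementation.

```python
import numpy as np

def content_consistency_loss(pred_feats, orig_feats, face_mask, attn_weights,
                             reweight=2.0):
    """Toy content-consistency loss (hypothetical, not the paper's code).

    pred_feats:   features of the current denoised estimate, shape (H, W)
    orig_feats:   features of the original content image (the anchor), (H, W)
    face_mask:    binary spatial mask marking the face region, (H, W)
    attn_weights: content-attention scores in [0, 1], (H, W)
    """
    # Spatial masking guidance: penalize deviation from the original-image
    # anchor only inside the face region.
    masked_err = face_mask * (pred_feats - orig_feats) ** 2
    # Content-attention reweighting: upweight locations the content
    # branch attends to most strongly.
    weights = 1.0 + (reweight - 1.0) * attn_weights
    return float(np.sum(weights * masked_err) / (np.sum(face_mask) + 1e-8))
```

In a guidance loop, a loss like this would be evaluated at each denoising step and its gradient used to nudge the latent toward the identity anchor without retraining the model.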

📝 Abstract
While diffusion models have demonstrated remarkable generative capabilities, existing style transfer techniques often struggle to maintain identity while achieving high-quality stylization. This limitation is particularly acute for images where faces are small or exhibit significant camera-to-face distances, frequently leading to inadequate identity preservation. To address this, we introduce a novel, training-free framework for identity-preserved stylized image synthesis using diffusion models. Key contributions include: (1) the "Mosaic Restored Content Image" technique, significantly enhancing identity retention, especially in complex scenes; and (2) a training-free content consistency loss that enhances the preservation of fine-grained content details by directing more attention to the original image during stylization. Our experiments reveal that the proposed approach substantially surpasses the baseline model in concurrently maintaining high stylistic fidelity and robust identity integrity, particularly under conditions of small facial regions or significant camera-to-face distances, all without necessitating model retraining or fine-tuning.
Problem

Research questions and friction points this paper is trying to address.

Maintaining identity in stylized image generation with diffusion models
Addressing identity loss in small or distant facial images
Enhancing content detail preservation without model retraining
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free framework for identity-preserved stylization
Mosaic Restored Content Image enhances identity retention
Content consistency loss preserves fine-grained details