Create Anything Anywhere: Layout-Controllable Personalized Diffusion Model for Multiple Subjects

📅 2025-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing personalized diffusion models struggle to simultaneously achieve precise spatial layout control and faithful preservation of multiple subjects’ identities. This paper proposes a fine-tuning-free framework for multi-subject personalized image generation, introducing two key innovations: (1) a dynamic-static complementary visual refinement module that captures temporal or pose variations in reference images via dynamic feature modeling, while distilling static identity cues and leveraging cross-attention guidance; and (2) a two-stage layout control mechanism imposing explicit spatial constraints during both training and inference. Together, these enable pixel-level layout controllability and consistent cross-subject identity preservation. Extensive evaluations across multiple benchmarks demonstrate significant improvements: +23.6% in ID-Retrieval accuracy and +31.4% in IoU. The framework supports arbitrary subject identity specification and customizable spatial layouts even in complex scenes.

📝 Abstract
Diffusion models have significantly advanced text-to-image generation, laying the foundation for the development of personalized generative frameworks. However, existing methods lack precise layout controllability and overlook the potential of dynamic features of reference subjects for improving fidelity. In this work, we propose the Layout-Controllable Personalized Diffusion (LCP-Diffusion) model, a novel framework that integrates subject identity preservation with flexible layout guidance in a tuning-free approach. Our model employs a Dynamic-Static Complementary Visual Refining module to comprehensively capture the intricate details of reference subjects, and introduces a Dual Layout Control mechanism to enforce robust spatial control across both training and inference stages. Extensive experiments validate that LCP-Diffusion excels in both identity preservation and layout controllability. To the best of our knowledge, this is a pioneering work enabling users to "create anything anywhere".
Problem

Research questions and friction points this paper is trying to address.

Lack of precise layout controllability in personalized diffusion models
Dynamic features of reference subjects overlooked as a source of fidelity gains
Need for tuning-free integration of identity preservation and layout guidance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tuning-free personalized diffusion model
Dynamic-Static Complementary Visual Refining
Dual Layout Control mechanism
Wei Li
MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, China
Hebei Li
PhD, University of Science and Technology of China
Event camera · Neuromorphic · 3D
Yansong Peng
University of Science and Technology of China
AI · AIGC · Computer Vision · Object Detection
Siying Wu
Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
Yueyi Zhang
Miromind; previously University of Science and Technology of China
Structured light · Depth Sensing · Event Camera · Medical Imaging
Xiaoyan Sun
Microsoft Research Asia
Image/Video Coding · Multimedia Processing · Computer Vision