🤖 AI Summary
To address key challenges in interior design generation, including inaccurate spatial scaling, weak stylistic controllability, and multi-view inconsistency, this paper proposes DiffDesign, a controllable diffusion model grounded in meta priors. The method has three parts: (1) a decoupled cross-attention mechanism that enables independent, fine-grained control over appearance, pose, and size; (2) an optimal-transport-driven view-alignment module that enforces geometric consistency across viewpoints; and (3) DesignHelper, described as the first fine-grained interior design dataset, covering more than 15 spatial types and 15 design styles and used for domain-specific fine-tuning of a pre-trained 2D diffusion model. Experiments demonstrate state-of-the-art performance in spatial accuracy, style fidelity, and multi-view consistency, improving the efficiency of design iteration.
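As a rough illustration of what decoupled cross-attention over design attributes could look like, the sketch below runs one attention branch per attribute and sums the branches with per-attribute strengths. This is not the paper's implementation: the function names, the `strengths` parameter, and the toy tokens are hypothetical, and the learned key/value projections that a real model would apply per branch are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over plain lists of vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def decoupled_cross_attention(queries, conditions, strengths):
    """Sum of independent attention branches, one per attribute.

    conditions maps an attribute name to (keys, values); strengths maps the
    same names to a scalar weight (hypothetical control knob, default 1.0).
    """
    dim = len(next(iter(conditions.values()))[1][0])
    total = [[0.0] * dim for _ in queries]
    for name, (keys, values) in conditions.items():
        branch = attention(queries, keys, values)
        weight = strengths.get(name, 1.0)
        for row, branch_row in zip(total, branch):
            for d in range(dim):
                row[d] += weight * branch_row[d]
    return total


# Toy image-feature queries and attribute tokens (made-up values).
queries = [[1.0, 0.0], [0.0, 1.0]]
conditions = {
    "appearance": ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]),
    "size": ([[0.5, 0.5]], [[2.0, 2.0]]),
}
baseline = decoupled_cross_attention(
    queries, conditions, {"appearance": 1.0, "size": 0.0})
resized = decoupled_cross_attention(
    queries, conditions, {"appearance": 1.0, "size": 0.5})
```

Because each attribute has its own branch, changing the `size` strength shifts the output only by the size branch's contribution and leaves the appearance term untouched, which is the kind of independent control the summary describes.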
📝 Abstract
Interior design is a complex, creative discipline that draws on aesthetics, functionality, ergonomics, and materials science. An effective solution must meet diverse requirements and typically comprises multiple deliverables, such as renderings and design drawings from various perspectives. As a result, the interior design process is often inefficient and demands significant creative effort. With advances in machine learning, generative models have emerged as a promising means of improving efficiency by creating designs from text descriptions or sketches. However, few generative works focus on interior design, so their outputs diverge substantially from practical needs: they differ in size and spatial scope, and generation quality is difficult to control. To address these challenges, we propose DiffDesign, a controllable diffusion model with meta priors for efficient interior design generation. Specifically, we use the generative priors of a 2D diffusion model pre-trained on a large image dataset as our rendering backbone. We guide the denoising process by disentangling cross-attention control over design attributes such as appearance, pose, and size, and we introduce an optimal transport-based alignment module to enforce view consistency. In addition, we construct DesignHelper, an interior design dataset of over 400 solutions spanning more than 15 spatial types and 15 design styles, which we use to fine-tune DiffDesign. Extensive experiments on various benchmark datasets demonstrate the effectiveness and robustness of DiffDesign.