🤖 AI Summary
Existing text-to-3D generation methods struggle to model the semantic structure of object parts and how those parts align with textual descriptions. To address this limitation, this work proposes DreamPartGen, a framework for semantically grounded, part-aware text-to-3D generation. The method introduces Duplex Part Latents (DPLs), which jointly model each part's geometry and appearance, and Relational Semantic Latents (RSLs), which capture inter-part dependencies derived from language. A synchronized co-denoising process refines both sets of latents together, enforcing mutual geometric and semantic consistency so that the generated 3D objects are coherent, interpretable, and aligned with the text. Across multiple benchmarks, the authors report state-of-the-art performance in geometric fidelity and text-shape alignment.
📝 Abstract
Understanding and generating 3D objects as compositions of meaningful parts is fundamental to human perception and reasoning. However, most text-to-3D methods overlook the semantic and functional structure of parts. While recent part-aware approaches introduce decomposition, they remain largely geometry-focused, lacking semantic grounding and failing to model how parts align with textual descriptions or their inter-part relations. We propose DreamPartGen, a framework for semantically grounded, part-aware text-to-3D generation. DreamPartGen introduces Duplex Part Latents (DPLs) that jointly model each part's geometry and appearance, and Relational Semantic Latents (RSLs) that capture inter-part dependencies derived from language. A synchronized co-denoising process enforces mutual geometric and semantic consistency, enabling coherent, interpretable, and text-aligned 3D synthesis. Across multiple benchmarks, DreamPartGen delivers state-of-the-art performance in geometric fidelity and text-shape alignment.
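To make the idea of synchronized co-denoising more concrete, here is a toy sketch in plain Python. It is not DreamPartGen's implementation: the abstract does not specify the architecture, so the function `co_denoise`, the `blend` stand-in for a learned denoiser, the pairwise layout of relational latents, and all dimensions and rates are assumptions made purely for illustration. The only point it tries to show is the update pattern, in which part latents (geometry plus appearance) and relational latents are each refined using the other's state from the same step.

```python
import random


def co_denoise(num_parts=3, dim=4, steps=10, seed=0):
    """Toy synchronized co-denoising over part and relation latents.

    Hypothetical illustration only; requires num_parts >= 2.
    """
    rng = random.Random(seed)

    def noise():
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    # "Duplex" part latent (assumed layout): a geometry and an appearance vector.
    parts = [{"geom": noise(), "app": noise()} for _ in range(num_parts)]
    # Assumed layout: one relational latent per unordered pair of parts.
    rels = {(i, j): noise()
            for i in range(num_parts) for j in range(i + 1, num_parts)}

    def mean(vectors):
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    def blend(x, ctx, rate):
        # Stand-in for a learned denoiser: move x a fraction `rate` toward ctx.
        return [(1 - rate) * a + rate * b for a, b in zip(x, ctx)]

    rate = 0.5
    for _ in range(steps):
        # Both updates read the same pre-step state, so they stay synchronized.
        new_parts = []
        for i, p in enumerate(parts):
            ctx = mean([r for pair, r in rels.items() if i in pair])
            new_parts.append({
                "geom": blend(p["geom"], mean([p["app"], ctx]), rate),
                "app": blend(p["app"], mean([p["geom"], ctx]), rate),
            })
        new_rels = {}
        for (a, b), r in rels.items():
            ctx = mean([parts[a]["geom"], parts[a]["app"],
                        parts[b]["geom"], parts[b]["app"]])
            new_rels[(a, b)] = blend(r, ctx, rate)
        parts, rels = new_parts, new_rels
    return parts, rels


if __name__ == "__main__":
    parts, rels = co_denoise()
    print(len(parts), "part latents,", len(rels), "relational latents")
```

In a real system the `blend` step would be replaced by a trained network conditioned on the text prompt and a noise schedule; the sketch only mirrors the structure of exchanging information between the two latent sets at every step.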