🤖 AI Summary
Problem: Existing 3D layout estimation methods are trained mainly on synthetic single-room or single-floor data and cannot model inter-floor structures such as staircases. Multi-floor buildings must be split into isolated floors before processing, which discards the global spatial context needed to reason about structures that connect levels.
Method: We introduce HouseLayout3D, a real-world, multi-floor 3D layout benchmark covering diverse architectural topologies and vertical connectivity patterns. Building on it, we propose MultiFloor3D, a training-free baseline that combines geometric priors with recent scene understanding methods to capture inter-floor spatial relationships while preserving building-scale context.
Contribution/Results: MultiFloor3D outperforms existing 3D layout estimation models, including supervised methods, on our new benchmark and on established datasets such as Structured3D, LSUN, and Stanford2D3D. This shows that a training-free approach is a viable baseline for multi-floor layout estimation, and it provides both a benchmark and a starting framework for building-scale structural understanding.
📝 Abstract
Current 3D layout estimation models are primarily trained on synthetic datasets containing simple single-room or single-floor environments. As a consequence, they cannot natively handle large multi-floor buildings and require scenes to be split into individual floors before processing, which removes the global spatial context that is essential for reasoning about structures such as staircases that connect multiple levels. In this work, we introduce HouseLayout3D, a real-world benchmark designed to support progress toward full building-scale layout estimation, including multiple floors and architecturally intricate spaces. We also present MultiFloor3D, a simple training-free baseline that leverages recent scene understanding methods and already outperforms existing 3D layout estimation models on both our benchmark and prior datasets, highlighting the need for further research in this direction. Data and code are available at: https://houselayout3d.github.io.