🤖 AI Summary
Automatic furniture arrangement has been held back by the scarcity of real, professionally designed floor plans with object-level furniture annotations. This work introduces AntPlan-270, a curated dataset of 270 floor plans with per-room furniture bounding boxes across ten residential room categories, and Architect-Ant, an editable automatic furnishing framework built on a fine-tuned vision-language model. Layouts are expressed in a compact, coordinate-based domain-specific language (DSL); procedural reasoning traces supervise fine-tuning for spatial reasoning, and preference optimization over candidate object placements further refines layout quality. The generated DSL can be rasterized into semantic masks that condition a Flux-based LoRA renderer, yielding realistic blueprint-style furnished floor-plan images while keeping the symbolic layout editable. Experiments indicate geometrically valid, functionally plausible layouts and suggest a scalable path for furnishing larger structure-only floor-plan datasets.
📝 Abstract
Furnished floor plans are fundamental to real estate visualization, interior design, and architectural workflows. However, progress in automatic furniture arrangement has been limited by the lack of real, professionally designed floor-plan datasets with object-level furniture annotations. To address this gap, we introduce AntPlan-270, a curated dataset of 270 architectural floor plans with per-room furniture bounding box annotations across ten residential room categories. Building on this dataset, we present Architect-Ant, an editable automatic furnishing framework powered by a fine-tuned vision-language model. Furniture layouts are represented using a compact, coordinate-based domain-specific language (DSL) that encodes object categories and placements relative to the room geometry. To improve spatial reasoning, we generate procedural reasoning traces that capture architectural constraints such as wall alignment, door and window clearance, circulation, fixture compatibility, and room-specific furniture inventories, and use them to supervise fine-tuning of the model. We then apply preference optimization over candidate object placements to further refine layout quality. The generated DSL can be rasterized into semantic masks and used to condition a Flux-based LoRA renderer, producing realistic blueprint-style furnished floor-plan images while preserving the editable symbolic layout. Experiments on layout furnishing show that Architect-Ant produces geometrically valid and functionally plausible layouts, and suggest a scalable path for furnishing larger structure-only floor-plan datasets.
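To make the DSL-to-mask step concrete, the following is a minimal sketch of how a coordinate-based layout could be parsed, checked for basic geometric validity, and rasterized into a semantic mask. The abstract does not specify the DSL grammar, so everything here is an assumption for illustration: the one-object-per-line syntax (`category x0 y0 x1 y1`), coordinates normalized to the room's bounding box, the category ids, and the function names `parse_layout`, `is_valid`, and `rasterize`. The validity check covers only bounds and overlap, not the richer constraints (door clearance, circulation) described above.

```python
from dataclasses import dataclass

# Hypothetical category ids; the actual label set is not given in the abstract.
CLASS_IDS = {"bed": 1, "wardrobe": 2, "nightstand": 3}


@dataclass(frozen=True)
class Placement:
    category: str
    x0: float  # left edge, normalized to the room's bounding box (assumed)
    y0: float  # top edge
    x1: float  # right edge
    y1: float  # bottom edge


def parse_layout(text):
    """Parse lines of the assumed form 'category x0 y0 x1 y1'."""
    placements = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        name, *coords = line.split()
        x0, y0, x1, y1 = map(float, coords)
        placements.append(Placement(name, x0, y0, x1, y1))
    return placements


def is_valid(placements):
    """Return True if every box is non-degenerate, in bounds, and no two overlap."""
    for p in placements:
        if not (0.0 <= p.x0 < p.x1 <= 1.0 and 0.0 <= p.y0 < p.y1 <= 1.0):
            return False
    for i, a in enumerate(placements):
        for b in placements[i + 1:]:
            # Axis-aligned boxes overlap only if they overlap on both axes.
            if a.x0 < b.x1 and b.x0 < a.x1 and a.y0 < b.y1 and b.y0 < a.y1:
                return False
    return True


def rasterize(placements, width, height):
    """Paint each placement's class id into a height x width grid (0 = empty)."""
    mask = [[0] * width for _ in range(height)]
    for p in placements:
        class_id = CLASS_IDS[p.category]
        for row in range(round(p.y0 * height), round(p.y1 * height)):
            for col in range(round(p.x0 * width), round(p.x1 * width)):
                mask[row][col] = class_id
    return mask


# Usage: a two-object bedroom layout on an 8 x 4 grid.
demo = """
bed 0.0 0.0 0.5 0.5
wardrobe 0.5 0.0 1.0 0.25
"""
layout = parse_layout(demo)
mask = rasterize(layout, width=8, height=4) if is_valid(layout) else None
```

Keeping the layout as a list of placements rather than only as pixels is what preserves editability in this sketch: a single line of the DSL can be changed and the mask regenerated, and the mask itself is the kind of input an image renderer could be conditioned on.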