Edge Prediction for Roof Wireframe Reconstruction with Transformers

📅 2026-06-01
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work proposes an end-to-end, DETR-style Transformer method for reconstructing 3D house roof wireframes from sparse Structure-from-Motion (SfM) point clouds, ground-level semantic segmentations, and depth maps. The sparse point cloud is dynamically subsampled according to semantic priority, and each point is augmented with Gestalt and ADE20k class features. For additional segmentation context, the points are projected into latent feature maps produced by a frozen autoencoder, and the resulting Gestalt feature encodings are fused with the point features. Learned query embeddings are then decoded directly into 3D wireframe edges via cross-attention. Evaluated on the HoHo 22k dataset, the method achieves a Hybrid Structure Score (HSS) of 0.6476 and ranks second on the private leaderboard of the S23DR Challenge 2026, significantly outperforming both handcrafted and learned baselines.
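The semantic-priority subsampling mentioned in the summary could look roughly like the sketch below. It is a minimal illustration and not the authors' implementation: the class names, the `PRIORITY` table, the fixed point `budget`, and the function `subsample_by_priority` are all hypothetical. The idea shown is simply that points from higher-priority classes are kept first, and a class is randomly thinned only when it would overflow the remaining budget.

```python
import random

# Hypothetical priority table (lower value = kept first). The class names and
# their ordering are assumptions for illustration, not taken from the paper.
PRIORITY = {"roof_edge": 0, "roof": 1, "wall": 2, "other": 3}


def subsample_by_priority(labels, budget, seed=0):
    """Return sorted indices of at most `budget` points, best classes first."""
    rng = random.Random(seed)
    lowest = max(PRIORITY.values())

    # Group point indices by the priority level of their semantic label.
    buckets = {}
    for idx, label in enumerate(labels):
        buckets.setdefault(PRIORITY.get(label, lowest), []).append(idx)

    kept = []
    for level in sorted(buckets):
        remaining = budget - len(kept)
        if remaining <= 0:
            break
        bucket = buckets[level]
        if len(bucket) <= remaining:
            kept.extend(bucket)  # whole class fits in the budget
        else:
            kept.extend(rng.sample(bucket, remaining))  # thin this class
    return sorted(kept)


labels = ["other", "roof", "roof_edge", "wall", "other", "roof_edge"]
selected = subsample_by_priority(labels, budget=3)
```

With a budget of three, both `roof_edge` points and the single `roof` point are retained, while `wall` and `other` points are dropped.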
📝 Abstract
This paper presents a competitive solution to the S23DR Challenge 2026, which aims to reconstruct 3D house roof wireframe models from sparse SfM point clouds, ground-level semantic segmentations, and depth maps. Our proposed method utilizes an end-to-end Transformer encoder-decoder architecture inspired by DETR. To effectively process the geometric and semantic data, the sparse SfM point cloud input is dynamically subsampled based on semantic priority and augmented with Gestalt and ADE20k class features. To further increase segmentation context, we fuse the point features with additional Gestalt feature encodings, obtained by projecting the points into latent feature maps produced by a frozen autoencoder. Learned query embeddings are then decoded directly into 3D wireframe edges via cross-attention mechanisms. Evaluated on the "HoHo 22k" dataset, our approach significantly outperforms both handcrafted and learned baselines, achieving a Hybrid Structure Score (HSS) of 0.6476 and securing the second-highest position on the challenge's private leaderboard.
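The step of projecting points into latent feature maps can be illustrated with the sketch below. It rests on several assumptions that the abstract does not specify: a pinhole camera model, a latent map stored at `1/stride` of the image resolution, nearest-cell lookup rather than interpolation, and zero vectors for points that fall outside the view. The function names `project_point` and `sample_latent` are hypothetical.

```python
def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Pinhole projection of a world point; returns (u, v) or None if behind the camera."""
    # Camera-space coordinates: X_cam = R @ X_world + t
    cam = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if cam[2] <= 0:
        return None
    return fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy


def sample_latent(point, feature_map, stride, **camera):
    """Look up the latent feature vector of the cell a 3D point projects into.

    `feature_map` is a nested list of shape [rows][cols][channels], assumed to
    be `stride` times smaller than the image (an assumption of this sketch).
    """
    channels = len(feature_map[0][0])
    uv = project_point(point, **camera)
    if uv is None:
        return [0.0] * channels  # point is behind the camera
    col, row = int(uv[0] // stride), int(uv[1] // stride)
    if not (0 <= row < len(feature_map) and 0 <= col < len(feature_map[0])):
        return [0.0] * channels  # point projects outside the image
    return list(feature_map[row][col])


# Toy setup: 16x16 image, stride 8, so the latent map has 2x2 cells.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
camera = dict(rotation=identity, translation=[0.0, 0.0, 0.0],
              fx=100.0, fy=100.0, cx=8.0, cy=8.0)
latent = [[[1.0, 10.0], [2.0, 20.0]],
          [[3.0, 30.0], [4.0, 40.0]]]

center_feature = sample_latent([0.0, 0.0, 1.0], latent, 8, **camera)
```

In a full pipeline, the sampled vector would be concatenated with or added to the per-point features before the Transformer encoder; how the fusion is actually performed is not stated in the abstract.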
Problem

Research questions and friction points this paper is trying to address.

roof wireframe reconstruction
3D reconstruction
sparse SfM point clouds
semantic segmentation
depth maps
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer
wireframe reconstruction
semantic-aware subsampling
Gestalt features
cross-attention decoding