Edge Prediction for Roof Wireframe Reconstruction with Transformers

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes an end-to-end, DETR-style Transformer method for reconstructing 3D house roof wireframes from sparse Structure-from-Motion (SfM) point clouds, ground-level semantic segmentations, and depth maps. The sparse point cloud is dynamically subsampled according to semantic priority and augmented with Gestalt and ADE20k class features. Point features are further fused with Gestalt encodings sampled from latent feature maps produced by a frozen autoencoder, and learned query embeddings are decoded directly into 3D wireframe edges through cross-attention. Evaluated on the HoHo 22k dataset, the method achieves a Hybrid Structure Score (HSS) of 0.6476 and ranks second on the private leaderboard of the S23DR Challenge 2026, significantly outperforming both handcrafted and learned baselines.
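The summary does not say how per-query outputs become a wireframe, so the following is only a minimal sketch under stated assumptions. It assumes each decoder query emits an existence logit plus two 3D endpoints, that queries are kept when their sigmoid score passes a threshold, and that nearby endpoints are merged into shared vertices. The names `decode_edges`, `edges_to_wireframe`, `threshold`, and `merge_dist` are hypothetical and not taken from the paper.

```python
import math


def decode_edges(query_outputs, threshold=0.5):
    """Keep the edges whose existence probability passes the threshold.

    query_outputs: list of (logit, start_xyz, end_xyz) tuples, one per query.
    This output layout is an assumption, not the paper's parameterization.
    """
    edges = []
    for logit, start, end in query_outputs:
        prob = 1.0 / (1.0 + math.exp(-logit))  # sigmoid of the existence logit
        if prob >= threshold:
            edges.append((tuple(start), tuple(end)))
    return edges


def edges_to_wireframe(edges, merge_dist=0.25):
    """Merge nearby endpoints into shared vertices and index the edges."""
    vertices = []
    connections = []

    def vertex_id(point):
        # Reuse an existing vertex if one lies within merge_dist.
        for i, existing in enumerate(vertices):
            if math.dist(point, existing) <= merge_dist:
                return i
        vertices.append(point)
        return len(vertices) - 1

    for start, end in edges:
        a, b = vertex_id(start), vertex_id(end)
        pair = (min(a, b), max(a, b))
        # Skip degenerate edges and duplicates.
        if a != b and pair not in connections:
            connections.append(pair)
    return vertices, connections
```

Treating the output as a set of independent edge predictions is what makes the approach DETR-like: no fixed ordering of edges is assumed, and unused queries simply score below the threshold.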

📝 Abstract

This paper presents a competitive solution to the S23DR Challenge 2026, which aims to reconstruct 3D house roof wireframe models from sparse SfM point clouds, ground-level semantic segmentations, and depth maps. Our proposed method uses an end-to-end Transformer encoder-decoder architecture inspired by DETR. To process the geometric and semantic data effectively, the sparse SfM point cloud input is dynamically subsampled based on semantic priority and augmented with Gestalt and ADE20k class features. To further increase segmentation context, we fuse the point features with additional Gestalt feature encodings, which are obtained by projecting the points into latent feature maps produced by a frozen autoencoder. Learned query embeddings are then decoded directly into 3D wireframe edges via cross-attention mechanisms. Evaluated on the "HoHo 22k" dataset, our approach significantly outperforms both handcrafted and learned baselines, achieving a Hybrid Structure Score (HSS) of 0.6476 and securing the second-highest position on the challenge's private leaderboard.
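The fusion step in the abstract (projecting points into latent feature maps and attaching the sampled encoding to each point) can be illustrated roughly as follows. This is a sketch under assumptions the abstract does not confirm: a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, points already expressed in camera coordinates, a latent map downsampled by an integer `stride`, nearest-cell lookup, and zero-padding for points that fall outside the view. All function and parameter names are hypothetical.

```python
def project_point(point_cam, fx, fy, cx, cy):
    """Project a camera-frame 3D point to pixel coordinates (pinhole model)."""
    x, y, z = point_cam
    if z <= 0:
        return None  # point is behind the camera
    return (fx * x / z + cx, fy * y / z + cy)


def sample_latent(feature_map, pixel, stride):
    """Look up the latent vector of the cell containing the pixel.

    feature_map: nested list of shape [rows][cols][channels], assumed to be
    the autoencoder's latent map at 1/stride of the image resolution.
    """
    if pixel is None:
        return None
    u, v = pixel
    col = int(u // stride)
    row = int(v // stride)
    if 0 <= row < len(feature_map) and 0 <= col < len(feature_map[0]):
        return feature_map[row][col]
    return None  # projection falls outside the image


def fuse(point_feature, latent, latent_dim):
    """Concatenate a point's own features with its sampled latent vector."""
    extra = latent if latent is not None else [0.0] * latent_dim
    return list(point_feature) + list(extra)
```

For example, with `stride=8` a point that projects to pixel (8, 8) reads the latent vector stored at row 1, column 1 of the map. A bilinear lookup would be a natural alternative to the nearest-cell choice made here.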

Problem

Research questions and friction points this paper is trying to address.

roof wireframe reconstruction

3D reconstruction

sparse SfM point clouds

semantic segmentation

depth maps

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer

wireframe reconstruction

semantic-aware subsampling
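As an illustration of semantic-aware subsampling, the sketch below keeps a fixed budget of points and fills it from the highest-priority semantic classes first, sampling randomly only within the class that overflows the budget. The paper's actual priority ordering and sampling rule are not given here, so the `priority` mapping, the `budget` parameter, and the function name are assumptions made for the example.

```python
import random


def subsample_by_priority(points, labels, priority, budget, seed=0):
    """Keep at most `budget` points, preferring labels with higher priority.

    points:   list of point coordinates
    labels:   one semantic label per point
    priority: dict mapping label -> numeric priority (hypothetical values)
    """
    rng = random.Random(seed)

    # Group point indices by semantic label.
    by_label = {}
    for idx, label in enumerate(labels):
        by_label.setdefault(label, []).append(idx)

    # Visit labels from highest to lowest priority; unknown labels count as 0.
    ordered = sorted(by_label, key=lambda lab: priority.get(lab, 0), reverse=True)

    kept = []
    for label in ordered:
        room = budget - len(kept)
        if room <= 0:
            break
        indices = by_label[label]
        if len(indices) > room:
            # Only the class that overflows the budget is randomly thinned.
            indices = rng.sample(indices, room)
        kept.extend(indices)

    kept.sort()  # preserve the original point order
    return [points[i] for i in kept], [labels[i] for i in kept]
```

With a budget of 5 and three "ridge" points, three "eave" points, and four "sky" points ranked in that order, all ridge points survive, two eave points are sampled, and the sky points are dropped.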