Extracting polygonal footprints in off-nadir images with Segment Anything Model

📅 2024-08-16
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To improve building footprint extraction from off-nadir (oblique) remote sensing imagery, this paper proposes OBMv2, an end-to-end, promptable model that predicts polygonal footprints directly instead of following the conventional multi-stage pipeline of roof segmentation plus offset prediction. Key contributions include: (1) a Self Offset Attention (SOFA) mechanism that improves performance across building types, from bungalows to skyscrapers; (2) a Multi-level Information System (MISS) that combines roof masks, building masks, and offsets for footprint prediction; and (3) a promptable design built on the Segment Anything Model (SAM). The method outputs polygonal footprints without post-processing. It is evaluated on the BONAI and OmniCity-view3 datasets, and its generalization is tested on the Huizhou test set.

📝 Abstract
Building Footprint Extraction (BFE) from off-nadir aerial images often involves roof segmentation and offset prediction to adjust roof boundaries to the building footprint. However, this multi-stage approach typically produces low-quality results, limiting its applicability in real-world data production. To address this issue, we present OBMv2, an end-to-end and promptable model for polygonal footprint prediction. Unlike its predecessor OBM, OBMv2 introduces a novel Self Offset Attention (SOFA) mechanism that improves performance across diverse building types, from bungalows to skyscrapers, enabling end-to-end footprint prediction without post-processing. Additionally, we propose a Multi-level Information System (MISS) to effectively leverage roof masks, building masks, and offsets for accurate footprint prediction. We evaluate OBMv2 on the BONAI and OmniCity-view3 datasets and demonstrate its generalization on the Huizhou test set. The code will be available at https://github.com/likaiucas/OBMv2.
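
The roof-plus-offset idea described in the abstract can be illustrated with a minimal sketch. It assumes the roof is already available as a list of pixel-coordinate vertices and that a single roof-to-footprint offset vector has been predicted for the building. The names `roof_to_footprint` and `polygon_area` and the sample values are hypothetical; they are not taken from the OBMv2 code.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def roof_to_footprint(roof: List[Point], offset: Point) -> List[Point]:
    """Translate every roof vertex by one per-building offset vector.

    Assumption: the offset points from the roof to the footprint, in pixels.
    """
    dx, dy = offset
    return [(x + dx, y + dy) for x, y in roof]


def polygon_area(poly: List[Point]) -> float:
    """Unsigned polygon area via the shoelace formula."""
    total = 0.0
    for i, (x0, y0) in enumerate(poly):
        x1, y1 = poly[(i + 1) % len(poly)]  # wrap around to close the ring
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Hypothetical rectangular roof and predicted offset (illustrative values).
roof = [(10.0, 10.0), (30.0, 10.0), (30.0, 25.0), (10.0, 25.0)]
footprint = roof_to_footprint(roof, (-4.0, 6.0))
print(footprint)
```

A pure translation keeps the polygon's shape and area, so any error in the predicted offset shifts the whole footprint rather than deforming it.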
Problem

Research questions and friction points this paper is trying to address.

Extracting precise polygonal building footprints from off-nadir images
Overcoming geometric complexities in off-nadir viewing angles
Improving boundary accuracy without external post-processing steps
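
One source of the geometric difficulty listed above is that the roof-to-footprint displacement grows with building height. The sketch below uses a simplifying parallel-projection assumption that is not stated in the paper: a building of height h viewed at off-nadir angle θ has its roof displaced by roughly h·tan(θ) on the ground. The function name `roof_offset_pixels` and all parameter values are hypothetical.

```python
import math


def roof_offset_pixels(height_m: float, off_nadir_deg: float, gsd_m: float) -> float:
    """Approximate roof-to-footprint displacement in pixels.

    Assumptions (not from the paper): flat terrain, parallel projection,
    and a uniform ground sample distance `gsd_m` in metres per pixel.
    """
    if gsd_m <= 0:
        raise ValueError("gsd_m must be positive")
    ground_shift_m = height_m * math.tan(math.radians(off_nadir_deg))
    return ground_shift_m / gsd_m


# Illustrative values only: a low building versus a tall one at the same angle.
low = roof_offset_pixels(height_m=4.0, off_nadir_deg=20.0, gsd_m=0.5)
tall = roof_offset_pixels(height_m=200.0, off_nadir_deg=20.0, gsd_m=0.5)
print(low, tall)
```

Under these assumptions the offset scales linearly with height, which is one way to see why a single model has to cope with very different offset lengths across building types.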
Innovation

Methods, ideas, or system contributions that make the work stand out.

Direct polygonal output without post-processing
High-Quality Mask Prompter for precise roofs
Self Offset Attention for accuracy improvement
Kai Li
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China; School of Electronic Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
Jingbo Chen
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Yupeng Deng
aircas
Yu Meng
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Diyou Liu
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Junxian Ma
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China; School of Electronic Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
Chenhao Wang
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China; School of Electronic Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China