MSP-MVS: Multi-granularity Segmentation Prior Guided Multi-View Stereo

📅 2024-07-27
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address reconstruction distortion and edge discontinuities in textureless regions for multi-view stereo (MVS), this paper proposes an edge-constrained patch deformation method guided by multi-granularity segmentation priors. The method has two main parts: (1) multi-granularity depth-edge priors derived from semantic segmentation confine patch deformation to depth-continuous regions, so that deformed patches in textureless areas do not cross depth edges; and (2) an anchor-adaptive balanced clustering mechanism, coupled with disparity-aware 3D cost optimization, mitigates attention imbalance and the local optima caused by fixed sampling. The framework combines Semantic-SAM-based segmentation, multi-scale edge aggregation and refinement, and dynamic anchor redistribution with decoupled clustering. On the ETH3D and Tanks & Temples benchmarks, the approach is reported to achieve state-of-the-art performance, with improved reconstruction accuracy in textureless areas and strong generalization across scenes.

📝 Abstract
Recently, patch deformation-based methods have demonstrated significant strength in multi-view stereo by adaptively expanding the receptive field of patches to help reconstruct textureless areas. However, such methods mainly concentrate on searching for pixels without matching ambiguity (i.e., reliable pixels) when constructing deformed patches, while neglecting the deformation instability caused by unexpected edge-skipping, resulting in potential matching distortions. To address this, we propose MSP-MVS, a method that introduces a multi-granularity segmentation prior for edge-confined patch deformation. Specifically, to avoid unexpected edge-skipping, we first aggregate and further refine multi-granularity depth edges obtained from Semantic-SAM as a prior to guide patch deformation within depth-continuous (i.e., homogeneous) areas. Moreover, to address the attention imbalance caused by edge-confined patch deformation, we implement adaptive equidistribution and disassemble-clustering of correlative reliable pixels (i.e., anchors), thereby promoting attention-consistent patch deformation. Finally, to prevent deformed patches from falling into local-minimum matching costs caused by the fixed sampling pattern, we introduce disparity-sampling synergistic 3D optimization to help identify global-minimum matching costs. Evaluations on the ETH3D and Tanks&Temples benchmarks show that our method obtains state-of-the-art performance with remarkable generalization.
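The idea of edge-confined patch deformation can be illustrated with a toy sketch. The code below is not the authors' implementation: it assumes a binary depth-edge mask and a binary reliable-pixel mask are already available, and it uses a plain breadth-first flood fill (a stand-in technique chosen here for illustration) to collect only those reliable pixels that can be reached from a textureless pixel without crossing an edge. The function name `edge_confined_anchors` and its parameters are hypothetical.

```python
from collections import deque


def edge_confined_anchors(edge, reliable, center, max_radius):
    """Collect reliable pixels reachable from `center` without crossing an edge.

    edge, reliable: 2D lists of 0/1 with identical shape (assumed inputs).
    center: (row, col) of the pixel whose patch is being deformed.
    max_radius: Chebyshev radius that bounds the search window.
    """
    rows, cols = len(edge), len(edge[0])
    r0, c0 = center
    seen = {center}
    queue = deque([center])
    anchors = []
    while queue:
        r, c = queue.popleft()
        if reliable[r][c] and (r, c) != center:
            anchors.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue  # outside the image
            if max(abs(nr - r0), abs(nc - c0)) > max_radius:
                continue  # outside the search window
            if (nr, nc) in seen or edge[nr][nc]:
                continue  # already visited, or blocked by a depth edge
            seen.add((nr, nc))
            queue.append((nr, nc))
    return anchors


# Toy 5x5 scene: a vertical depth edge in column 2 splits the image.
edge = [[1 if c == 2 else 0 for c in range(5)] for _ in range(5)]
reliable = [[0] * 5 for _ in range(5)]
reliable[0][1] = 1  # same side of the edge as the center
reliable[2][4] = 1  # opposite side of the edge
anchors = edge_confined_anchors(edge, reliable, (2, 0), 4)
```

In this toy scene only the reliable pixel on the same side of the edge is returned; an unconstrained search would also pick up the pixel across the edge, which is the edge-skipping behaviour the abstract describes as a source of matching distortion.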
Problem

Research questions and friction points this paper is trying to address.

Deformation instability in patch deformation-based multi-view stereo, caused by deformed patches skipping across depth edges.
Attention imbalance among anchors once patch deformation is confined by edges.
Deformed patches settling into local-minimum matching costs under a fixed sampling pattern.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-granularity segmentation prior guides patch deformation
Adaptive equidistribution balances attention in patch deformation
Disparity-sampling synergistic 3D optimization prevents local-minimum costs
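The local-minimum problem named in the last point can be shown with a small one-dimensional example. This sketch is not the paper's disparity-sampling synergistic 3D optimization, whose details are not given on this page; it only assumes a scalar matching cost over disparity and compares a single greedy descent with descents started from several evenly spaced disparity samples. The cost function and the names `local_descent` and `sampled_descent` are hypothetical.

```python
def local_descent(cost, d, step, lo, hi):
    """Greedy descent: move by +/- step while the cost strictly decreases."""
    while True:
        candidates = [x for x in (d - step, d + step) if lo <= x <= hi]
        best = min(candidates, key=cost, default=d)
        if cost(best) >= cost(d):
            return d  # no neighbour is cheaper: a (possibly local) minimum
        d = best


def sampled_descent(cost, lo, hi, n_samples, step):
    """Run greedy descent from evenly spaced starts and keep the cheapest result."""
    starts = [lo + (hi - lo) * i / (n_samples - 1) for i in range(n_samples)]
    results = [local_descent(cost, s, step, lo, hi) for s in starts]
    return min(results, key=cost)


def toy_cost(d):
    """Toy matching cost: local minimum at d=2 (cost 1), global minimum at d=8 (cost 0)."""
    return min((d - 2.0) ** 2 + 1.0, (d - 8.0) ** 2)


stuck = local_descent(toy_cost, 1.0, 0.5, 0.0, 10.0)    # single start near d=2
found = sampled_descent(toy_cost, 0.0, 10.0, 5, 0.5)    # five starts across the range
```

Here the single descent stops in the shallow basin at d = 2, while the sampled variant reaches the deeper basin at d = 8. The trade-off is extra cost evaluations per pixel, which grows with the number of samples.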
Authors
Zhenlong Yuan
Institute of Computing Technology, Chinese Academy of Sciences
Cong Liu
Peng Cheng Laboratory
Fei Shen
National University of Singapore
Controllable Generation, Multimodal Safety
Zhaoxin Li
Georgia Institute of Technology
Robot Learning, Explainable Artificial Intelligence
Tianlu Mao
Institute of Computing Technology, Chinese Academy of Sciences
Zhaoqi Wang
Institute of Computing Technology, Chinese Academy of Sciences