IPDRecon: Image-Plane Geometric Decoding for View-Invariant Indoor Scene Reconstruction

📅 2025-09-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing volumetric indoor reconstruction methods rely heavily on multi-view geometric constraints, so their performance degrades severely under sparse-view conditions and in occluded regions. To address this, we propose an image-plane geometric decoding framework that bypasses inter-view ray intersection and instead extracts spatial structure information from single views. Our key contributions are: (1) a pixel-level confidence encoder (PCE), which quantifies observation reliability; (2) an affine compensation module (ACM), which corrects imaging distortions; and (3) an image-plane spatial decoder (IPSD), which incorporates physical imaging priors for high-fidelity 2D-to-3D mapping. Evaluated on ScanNetV2, our method maintains nearly identical reconstruction quality when the number of input views is reduced by 40%, with a coefficient of variation of only 0.24% and 99.7% performance retention. This significantly improves view invariance and robustness in occluded and boundary regions.

📝 Abstract
Volume-based indoor scene reconstruction methods demonstrate significant research value due to their superior generalization capability and real-time deployment potential. However, existing methods rely on multi-view pixel back-projection ray intersections as weak geometric constraints to determine spatial positions, so reconstruction quality depends heavily on input view density and is poor in overlapping regions and unobserved areas. Addressing these issues requires reducing the dependency on inter-view geometric constraints while exploiting the rich spatial information within individual views. We propose IPDRecon, an image-plane decoding framework comprising three core components: a Pixel-level Confidence Encoder (PCE), an Affine Compensation Module (ACM), and an Image-Plane Spatial Decoder (IPSD). These modules collaboratively decode the 3D structural information encoded in 2D images through physical imaging processes, effectively preserving spatial geometric features including edges, hollow structures, and complex textures while significantly enhancing view-invariant reconstruction. Experiments on ScanNetV2 confirm that IPDRecon achieves superior reconstruction stability, maintaining nearly identical quality when the view count is reduced by 40%. The method achieves a coefficient of variation of only 0.24%, a performance retention rate of 99.7%, and a maximum performance drop of merely 0.42%. This demonstrates that exploiting intra-view spatial information provides a robust solution for view-limited scenarios in practical applications.
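
The abstract does not say how its three stability figures are defined, so the following is only a minimal sketch under stated assumptions. It takes the coefficient of variation as the population standard deviation over the mean, retention as the score at the sparsest view setting relative to the full-view baseline, and maximum drop as the largest relative drop from that baseline. The function name `stability_metrics` and the example scores are hypothetical and are not results from the paper.

```python
import statistics


def stability_metrics(scores):
    """Summarize how stable a quality metric is as views are removed.

    scores: higher-is-better metric values, ordered from the full-view
    baseline (first) to the sparsest view setting (last).
    Returns (cv_percent, retention_percent, max_drop_percent).
    These definitions are assumptions, not taken from the paper.
    """
    baseline = scores[0]
    mean = statistics.fmean(scores)
    # Coefficient of variation: spread relative to the mean, in percent.
    cv = statistics.pstdev(scores) / mean * 100.0
    # Retention: quality at the sparsest setting relative to the baseline.
    retention = scores[-1] / baseline * 100.0
    # Maximum drop: worst relative loss against the baseline at any setting.
    max_drop = (baseline - min(scores)) / baseline * 100.0
    return cv, retention, max_drop


# Made-up scores for four view-count settings (illustrative only).
cv, retention, max_drop = stability_metrics([0.750, 0.748, 0.747, 0.749])
print(f"CV={cv:.2f}%  retention={retention:.2f}%  max drop={max_drop:.2f}%")
```

Under these definitions retention and maximum drop need not sum to 100%, because the worst setting is not necessarily the sparsest one.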
Problem

Research questions and friction points this paper is trying to address.

Reduces dependency on multi-view geometric constraints
Enhances reconstruction in overlapping regions and unobserved areas
Decodes 3D structural information from 2D images
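
For context on why a single view is normally insufficient: in the textbook pinhole camera model, a pixel fixes only a ray through the camera center, and the depth along that ray is the unknown that multi-view ray intersection usually resolves. The sketch below illustrates that generic model only, not IPDRecon's decoder. The `Intrinsics` class, the `backproject` function, and the numeric values are hypothetical, and zero skew and no lens distortion are assumed.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics (assumed: zero skew, no lens distortion)."""
    fx: float  # focal length in pixels, x axis
    fy: float  # focal length in pixels, y axis
    cx: float  # principal point, x coordinate in pixels
    cy: float  # principal point, y coordinate in pixels


def backproject(u, v, depth, cam):
    """Map pixel (u, v) with a given z-depth to a 3D point in the camera frame."""
    x = (u - cam.cx) / cam.fx * depth
    y = (v - cam.cy) / cam.fy * depth
    return (x, y, depth)


cam = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)

# The same pixel yields a different 3D point for every candidate depth:
# one view constrains the ray but not the position along it.
for depth in (1.0, 2.0, 4.0):
    print(depth, backproject(820, 240, depth, cam))
```

Every point printed lies on the same ray, which is the ambiguity that any single-view decoding approach has to resolve from cues inside the image itself.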
Innovation

Methods, ideas, or system contributions that make the work stand out.

Image-plane decoding framework for 3D reconstruction
Decodes 3D structure from 2D images via imaging physics
Maintains reconstruction quality with reduced view count
Mingyang Li
School of Microelectronics, Tianjin University, Tianjin, China
Yimeng Fan
School of Microelectronics, Tianjin University, Tianjin, China
Changsong Liu
School of Microelectronics, Tianjin University, Tianjin, China
Tianyu Zhou
PhD Student, Purdue University
Robotic, Human-Robot Teaming, Autonomous System, Optimization, AI & Control System
Xin Wang
School of Microelectronics, Tianjin University, Tianjin, China
Yanyan Liu
College of Electronic Information and Optical Engineering, Nankai University, Tianjin, China
Wei Zhang
School of Microelectronics, Tianjin University, Tianjin, China