PE3R: Perception-Efficient 3D Reconstruction

📅 2025-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 2D-to-3D scene reconstruction methods suffer from poor generalization, low perceptual fidelity, and slow inference. To address these limitations, we propose PE3R, the first end-to-end feed-forward framework for open-vocabulary 3D semantic field reconstruction, replacing iterative per-scene optimization with joint 2D visual feature distillation, implicit 3D semantic field modeling, and open-vocabulary prompt alignment. Our approach enables zero-shot cross-scene generalization while achieving high geometric accuracy and fine-grained semantic segmentation at real-time inference speeds. On multi-task benchmarks, it achieves at least a 9× speedup over prior methods while attaining state-of-the-art performance in both semantic segmentation accuracy and geometric reconstruction quality. To our knowledge, this is the first method to realize efficient, precise, and generalizable 3D scene reconstruction driven by open-vocabulary semantics.
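The open-vocabulary prompt alignment described above can be illustrated with a toy sketch: per-point semantic features (as would be distilled from 2D views into a 3D semantic field) are matched to text-prompt embeddings by cosine similarity. All names, shapes, and the embedding setup below are hypothetical illustrations, not the paper's actual interfaces.

```python
import numpy as np

def normalize(x):
    """L2-normalize feature vectors along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def label_points(point_feats, prompt_embeds):
    """Assign each 3D point the open-vocabulary prompt whose
    embedding has the highest cosine similarity to its feature."""
    sims = normalize(point_feats) @ normalize(prompt_embeds).T
    return sims.argmax(axis=-1)

# Toy data: 3 prompt embeddings (e.g. "chair", "table", "floor")
# and 4 point features that are noisy copies of known prompts.
rng = np.random.default_rng(0)
prompts = rng.normal(size=(3, 16))
points = prompts[[0, 0, 2, 1]] + 0.01 * rng.normal(size=(4, 16))
print(label_points(points, prompts))  # -> [0 0 2 1]
```

Because the assignment is a single matrix multiply and argmax per query, it stays feed-forward: changing the prompt set requires no re-optimization of the reconstructed field.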

📝 Abstract
Recent advancements in 2D-to-3D perception have significantly improved the understanding of 3D scenes from 2D images. However, existing methods face critical challenges, including limited generalization across scenes, suboptimal perception accuracy, and slow reconstruction speeds. To address these limitations, we propose Perception-Efficient 3D Reconstruction (PE3R), a novel framework designed to enhance both accuracy and efficiency. PE3R employs a feed-forward architecture to enable rapid 3D semantic field reconstruction. The framework demonstrates robust zero-shot generalization across diverse scenes and objects while significantly improving reconstruction speed. Extensive experiments on 2D-to-3D open-vocabulary segmentation and 3D reconstruction validate the effectiveness and versatility of PE3R. The framework achieves a minimum 9-fold speedup in 3D semantic field reconstruction, along with substantial gains in perception accuracy and reconstruction precision, setting new benchmarks in the field. The code is publicly available at: https://github.com/hujiecpp/PE3R.
Problem

Research questions and friction points this paper is trying to address.

Improves 3D scene understanding from 2D images
Addresses limited generalization and slow reconstruction speeds
Enhances accuracy and efficiency in 3D reconstruction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feed-forward architecture for rapid 3D reconstruction
Robust zero-shot generalization across diverse scenes
9-fold speedup in 3D semantic field reconstruction