ActMVS: Active Scene Reconstruction with Monocular Multi-View Stereo

📅 2026-05-31
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing active scene reconstruction methods rely on depth sensors to update occupancy maps, which adds cost and weight to robots and UAVs, while current monocular reconstruction methods run offline and cannot deliver globally consistent dense depth at the frame rates navigation requires. This work proposes ActMVS, the first monocular active reconstruction framework: it constructs a view factor graph to guide multi-view stereo depth prediction and applies global depth optimization to generate high-quality, globally consistent dense depth maps online. Requiring only a monocular camera, the method needs neither depth sensors nor offline processing. Evaluated on the Replica dataset, it achieves reconstruction performance competitive with RGB-D approaches and lets monocular robots or UAVs maintain reliable occupancy maps for safe trajectory planning.
📝 Abstract
Active scene reconstruction enables robots/UAVs to autonomously plan trajectories and reconstruct environments without costly manual data acquisition. Unlike passive methods, active reconstruction requires real-time construction of high-confidence occupancy maps for collision-free navigation. Existing approaches rely on depth sensors for occupancy map updates, increasing platform cost and weight. To advance spatial intelligence, we aim for a vision-only monocular solution. However, current monocular scene reconstruction methods operate offline and fail to deliver globally consistent dense depth at the frame rates required for robot/UAV navigation. To bridge this gap, we introduce ActMVS, the first framework for monocular active reconstruction. Our framework integrates view factor graph construction for informed Multi-View Stereo depth prediction with a global depth optimization to enable the online generation of high-quality, globally consistent dense depth maps. This allows monocular robots/UAVs to maintain reliable occupancy maps for safe trajectory planning during reconstruction. Experiments on the Replica dataset demonstrate performance competitive with RGB-D methods. Our code and data are available at https://github.com/TrickyGo/ActMVS.
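To make the link between dense depth and occupancy maps concrete, here is a minimal sketch of how a predicted depth map could feed a voxel occupancy grid. It is not taken from the ActMVS code: the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the helper names `backproject` and `update_occupancy`, the log-odds increment `hit`, and the grid layout are all assumptions chosen for illustration.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift a dense depth map (H, W) to camera-frame 3D points (N, 3).

    Assumes a pinhole camera; pixels with non-positive depth are skipped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def update_occupancy(log_odds, points_world, origin, voxel_size, hit=0.85):
    """Add log-odds evidence to every voxel that contains a depth end point.

    Hypothetical helper: only "hit" evidence is added; free-space carving
    along each ray is omitted to keep the sketch short.
    """
    idx = np.floor((points_world - origin) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(log_odds.shape)), axis=1)
    idx = np.unique(idx[inside], axis=0)  # count each voxel once per frame
    log_odds[idx[:, 0], idx[:, 1], idx[:, 2]] += hit
    return log_odds

# Toy example: a flat wall 1 m in front of a 4x4-pixel camera.
depth = np.full((4, 4), 1.0)
pts_cam = backproject(depth, fx=2.0, fy=2.0, cx=2.0, cy=2.0)

# Camera-to-world pose (identity here); a real system would use the
# estimated pose of the current frame.
T = np.eye(4)
pts_world = pts_cam @ T[:3, :3].T + T[:3, 3]

grid = np.zeros((8, 8, 8))
grid = update_occupancy(grid, pts_world,
                        origin=np.array([-2.0, -2.0, 0.0]), voxel_size=0.5)
occupied = grid > 0  # voxels a planner would treat as obstacles
```

A planner needs free space as well as obstacles, so a fuller version would also lower the log-odds of voxels traversed by each camera ray. The sketch also shows why the abstract stresses global consistency: every depth error is written directly into the map the planner relies on.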
Problem

Research questions and friction points this paper is trying to address.

active reconstruction
monocular
dense depth
real-time
occupancy mapping
Innovation

Methods, ideas, or system contributions that make the work stand out.

monocular active reconstruction
multi-view stereo
factor graph
global depth optimization
online dense depth