HS-SLAM: Hybrid Representation with Structural Supervision for Improved Dense SLAM

📅 2025-03-27

📈 Citations: 0

✨ Influential: 0

career value

206K/year

🤖 AI Summary

To address critical limitations in NeRF-SLAM—including incomplete scene representation, missing structural priors, and poor global consistency under dynamic scenes or large camera motions—this paper proposes a high-accuracy, robust real-time dense SLAM framework. Methodologically, it introduces three key innovations: (1) a novel hybrid implicit encoding scheme combining hash grids, tri-planes, and a single spherical harmonic basis to jointly capture local details and enforce global geometric constraints; (2) non-local pixel-block structural supervision to explicitly model long-range geometric consistency; and (3) an active global bundle adjustment (BA) mechanism that jointly optimizes camera poses and the neural radiance field online. Experiments on multiple challenging dynamic and large-motion sequences demonstrate that our method significantly outperforms state-of-the-art approaches: it achieves a 23.6% improvement in tracking accuracy and a 31.4% increase in reconstruction completeness, while maintaining real-time performance (>15 FPS).

Technology Category

Application Category

📝 Abstract

NeRF-based SLAM has recently achieved promising results in tracking and reconstruction. However, existing methods face challenges in providing sufficient scene representation, capturing structural information, and maintaining global consistency in scenes emerging significant movement or being forgotten. To this end, we present HS-SLAM to tackle these problems. To enhance scene representation capacity, we propose a hybrid encoding network that combines the complementary strengths of hash-grid, tri-planes, and one-blob, improving the completeness and smoothness of reconstruction. Additionally, we introduce structural supervision by sampling patches of non-local pixels rather than individual rays to better capture the scene structure. To ensure global consistency, we implement an active global bundle adjustment (BA) to eliminate camera drifts and mitigate accumulative errors. Experimental results demonstrate that HS-SLAM outperforms the baselines in tracking and reconstruction accuracy while maintaining the efficiency required for robotics.

Problem

Research questions and friction points this paper is trying to address.

Enhancing scene representation for dense SLAM

Improving structural information capture in SLAM

Maintaining global consistency in dynamic scenes

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid encoding network combining hash-grid, tri-planes, one-blob

Structural supervision via non-local pixel patches

Active global bundle adjustment for consistency

🔎 Similar Papers

Hi-SLAM: Scaling-up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting