MambaTrack3D: A State Space Model Framework for LiDAR-Based Object Tracking under High Temporal Variation

📅 2025-11-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address high computational complexity, severe temporal redundancy, and insufficient exploitation of geometric priors in LiDAR point cloud single-object tracking under outdoor scenarios with high temporal variation (HTV), this paper proposes an efficient tracking framework built on state space modeling. It uses the Mamba architecture for inter-frame information propagation with near-linear complexity, and designs a grouped feature enhancement module that separates foreground and background semantics at the channel level, which reduces temporal redundancy in the memory bank. On the KITTI-HTV and nuScenes-HTV benchmarks, the method outperforms HVTrack by up to 6.5 in success and 9.5 in precision under moderate temporal gaps, and it remains highly competitive with state-of-the-art trackers on the standard KITTI dataset.

📝 Abstract

Dynamic outdoor environments with high temporal variation (HTV) pose significant challenges for 3D single object tracking in LiDAR point clouds. Existing memory-based trackers often suffer from quadratic computational complexity, temporal redundancy, and insufficient exploitation of geometric priors. To address these issues, we propose MambaTrack3D, a novel HTV-oriented tracking framework built upon the state space model Mamba. Specifically, we design a Mamba-based Inter-frame Propagation (MIP) module that replaces conventional single-frame feature extraction with efficient inter-frame propagation, achieving near-linear complexity while explicitly modeling spatial relations across historical frames. Furthermore, a Grouped Feature Enhancement Module (GFEM) is introduced to separate foreground and background semantics at the channel level, thereby mitigating temporal redundancy in the memory bank. Extensive experiments on KITTI-HTV and nuScenes-HTV benchmarks demonstrate that MambaTrack3D consistently outperforms both HTV-oriented and normal-scenario trackers, achieving improvements of up to 6.5 success and 9.5 precision over HVTrack under moderate temporal gaps. On the standard KITTI dataset, MambaTrack3D remains highly competitive with state-of-the-art normal-scenario trackers, confirming its strong generalization ability. Overall, MambaTrack3D achieves a superior accuracy-efficiency trade-off, delivering robust performance across both specialized HTV and conventional tracking scenarios.

Problem

Research questions and friction points this paper is trying to address.

Addresses 3D single object tracking challenges in LiDAR point clouds under high temporal variation

Overcomes quadratic complexity and temporal redundancy in memory-based trackers

Improves geometric prior utilization while maintaining near-linear computational efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Mamba state space model for inter-frame propagation

Separates foreground and background via grouped feature enhancement

Achieves near-linear complexity with spatial relation modeling
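
To illustrate why a state space recurrence scales near-linearly with the number of historical frames, here is a minimal sketch of a diagonal linear state space scan. It is not the paper's MIP module: the function name `ssm_scan` and the fixed scalar parameters `a`, `b`, and `c` are assumptions made for illustration, whereas Mamba makes its parameters input-dependent. The point is only that each frame updates a fixed-size state once, so the cost grows with the sequence length rather than with its square.

```python
from typing import List, Sequence


def ssm_scan(
    frames: Sequence[Sequence[float]],
    a: float = 0.9,  # hypothetical state decay (fixed here; input-dependent in Mamba)
    b: float = 0.1,  # hypothetical input gain
    c: float = 1.0,  # hypothetical output gain
) -> List[List[float]]:
    """Propagate per-frame feature vectors through a diagonal linear recurrence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t
    """
    if not frames:
        return []
    dim = len(frames[0])
    state = [0.0] * dim  # fixed-size summary of all frames seen so far
    outputs: List[List[float]] = []
    for x in frames:  # one pass over the sequence: O(T * dim) work
        if len(x) != dim:
            raise ValueError("all frames must have the same feature dimension")
        state = [a * h + b * xi for h, xi in zip(state, x)]
        outputs.append([c * h for h in state])
    return outputs


if __name__ == "__main__":
    history = [[1.0, 0.0], [1.0, 2.0]]  # two toy frame feature vectors
    print(ssm_scan(history, a=0.5, b=1.0, c=1.0))
```

Each output depends on all earlier frames through the running state, without comparing every frame against every other frame as pairwise attention over a memory bank would.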
