🤖 AI Summary
To address high computational complexity, severe temporal redundancy, and insufficient exploitation of geometric priors in LiDAR point cloud single-object tracking under high temporal variation (HTV) outdoor scenarios, this paper proposes MambaTrack3D, an efficient tracking framework built on the state space model Mamba. Its Mamba-based Inter-frame Propagation (MIP) module propagates information across frames with near-linear complexity while explicitly modeling spatial relations across historical frames. A Grouped Feature Enhancement Module (GFEM) separates foreground and background semantics at the channel level, thereby mitigating temporal redundancy in the memory bank. On the KITTI-HTV and nuScenes-HTV benchmarks, the method outperforms HVTrack by up to 6.5 success and 9.5 precision under moderate temporal gaps, while remaining highly competitive with state-of-the-art trackers on the standard KITTI dataset.
📝 Abstract
Dynamic outdoor environments with high temporal variation (HTV) pose significant challenges for 3D single object tracking in LiDAR point clouds. Existing memory-based trackers often suffer from quadratic computational complexity, temporal redundancy, and insufficient exploitation of geometric priors. To address these issues, we propose MambaTrack3D, a novel HTV-oriented tracking framework built upon the state space model Mamba. Specifically, we design a Mamba-based Inter-frame Propagation (MIP) module that replaces conventional single-frame feature extraction with efficient inter-frame propagation, achieving near-linear complexity while explicitly modeling spatial relations across historical frames. Furthermore, a Grouped Feature Enhancement Module (GFEM) is introduced to separate foreground and background semantics at the channel level, thereby mitigating temporal redundancy in the memory bank. Extensive experiments on KITTI-HTV and nuScenes-HTV benchmarks demonstrate that MambaTrack3D consistently outperforms both HTV-oriented and normal-scenario trackers, achieving improvements of up to 6.5 success and 9.5 precision over HVTrack under moderate temporal gaps. On the standard KITTI dataset, MambaTrack3D remains highly competitive with state-of-the-art normal-scenario trackers, confirming its strong generalization ability. Overall, MambaTrack3D achieves a superior accuracy-efficiency trade-off, delivering robust performance across both specialized HTV and conventional tracking scenarios.
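The contrast between quadratic and near-linear complexity in the abstract can be illustrated with a toy example. The sketch below is not the MIP module: it is a fixed-parameter, diagonal linear state-space recurrence written in plain Python, whereas Mamba makes its parameters input-dependent and uses a hardware-aware scan. The names `ssm_scan`, `decay`, `in_gain`, `out_gain`, and `count_pairwise_interactions` are hypothetical and chosen only for illustration. It shows that a recurrent state visits each token once, while an all-pairs memory read grows with the square of the token count.

```python
from typing import List, Sequence


def ssm_scan(
    tokens: Sequence[Sequence[float]],
    decay: Sequence[float],
    in_gain: Sequence[float],
    out_gain: Sequence[float],
) -> List[List[float]]:
    """Run a diagonal linear state-space recurrence over a token sequence.

    Assumed toy model (not the paper's formulation):
        h_t = decay * h_{t-1} + in_gain * x_t   (element-wise)
        y_t = out_gain * h_t
    Each token is processed exactly once, so the cost is linear in len(tokens).
    """
    dim = len(decay)
    state = [0.0] * dim
    outputs: List[List[float]] = []
    for x in tokens:
        # Update the hidden state from the previous state and the current token.
        state = [decay[i] * state[i] + in_gain[i] * x[i] for i in range(dim)]
        # Read out the current state.
        outputs.append([out_gain[i] * state[i] for i in range(dim)])
    return outputs


def count_pairwise_interactions(num_tokens: int) -> int:
    """Token pairs compared by an all-pairs (attention-style) memory read."""
    return num_tokens * num_tokens


if __name__ == "__main__":
    # Three 2-channel tokens standing in for features from successive frames.
    frames = [[1.0, 2.0], [0.0, 0.0], [1.0, 0.0]]
    result = ssm_scan(frames, decay=[0.5, 0.5], in_gain=[1.0, 1.0], out_gain=[1.0, 1.0])
    print(result[-1])  # [1.25, 0.5]
    # Scan steps grow as N; all-pairs comparisons grow as N * N.
    print(len(frames), count_pairwise_interactions(len(frames)))  # 3 9
```

In this toy setting the final output still carries a decayed trace of the first token, which is the sense in which a recurrent state can propagate historical information without revisiting every stored frame.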