🤖 AI Summary
This paper addresses core challenges in LiDAR point-cloud-based 3D multi-object tracking (false detections of distant or occluded objects, trajectory drift, and frequent ID switches) by proposing RobMOT, an online robust tracking framework. Methodologically, it introduces a trajectory validity mechanism and a multi-stage observation gating strategy to suppress observation noise, along with a refined Kalman filter term that makes state estimation more robust. Its key contribution is the systematic reduction of ghost trajectories and identity confusion in detection-driven tracking pipelines. On the KITTI validation set, MOTA improves by 29.47% with the SECOND detector and HOTA by up to 8.7%; on the Waymo Open Dataset, MOTA improves by 1.77%. The framework runs at 3,221 FPS on a single CPU, combining high accuracy with real-time efficiency.
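For readers unfamiliar with the headline metric, the sketch below computes MOTA from its standard CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT. The function name `mota` and all counts in the usage note are hypothetical illustrations, not figures from the paper; the point is only to show why removing ghost tracks (false positives) and ID switches raises the score.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multi-Object Tracking Accuracy under the standard CLEAR MOT definition.

    All arguments are counts accumulated over every frame of a sequence:
    missed objects, spurious track outputs, identity switches, and the
    total number of ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts (not from the paper): same misses and ID switches,
# but the second tracker emits far fewer ghost tracks (false positives).
baseline = mota(false_negatives=100, false_positives=300, id_switches=20, num_gt=1000)
fewer_ghosts = mota(false_negatives=100, false_positives=50, id_switches=20, num_gt=1000)
```

With these made-up counts, `fewer_ghosts` is higher than `baseline` purely because false positives dropped, which is the mechanism behind the reported MOTA gains.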
📝 Abstract
This paper addresses limitations of 3D tracking-by-detection methods, particularly in identifying legitimate trajectories and in reducing state estimation drift in Kalman filters. Existing methods often filter detections by thresholding their confidence scores, which can fail for distant and occluded objects and lead to false positives. To tackle this, we propose a novel track validity mechanism and a multi-stage observational gating process that significantly reduce ghost tracks and enhance tracking performance. Our method achieves a $29.47\%$ improvement in Multi-Object Tracking Accuracy (MOTA) on the KITTI validation dataset with the SECOND detector. Additionally, a refined Kalman filter term reduces localization noise, improving Higher Order Tracking Accuracy (HOTA) by $4.8\%$. The online framework, RobMOT, outperforms state-of-the-art methods across multiple detectors, with HOTA improvements of up to $3.92\%$ on the KITTI testing dataset and $8.7\%$ on the validation dataset, while keeping identity switches low. RobMOT excels in challenging scenarios, such as tracking distant objects and objects under prolonged occlusion, with a $1.77\%$ MOTA improvement on the Waymo Open Dataset, and it runs at 3221 FPS on a single CPU, making it efficient enough for real-time multi-object tracking.
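The abstract does not spell out RobMOT's gating stages or its refined Kalman term, so the following is only a generic sketch of observation gating in a Kalman tracker, not the authors' method. It assumes a constant-velocity state `[x, y, z, vx, vy, vz]`, position-only detections, and a single chi-square gate on the squared Mahalanobis distance of the innovation. The helper names (`constant_velocity_model`, `gate`, `step`) and the noise matrices in the usage note are hypothetical.

```python
import numpy as np

# 99% quantile of the chi-square distribution with 3 degrees of freedom,
# used as the gate threshold for a 3D position measurement.
CHI2_99_3DOF = 11.345


def constant_velocity_model(dt):
    """Return (F, H) for a state [x, y, z, vx, vy, vz] observed by position."""
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)                    # position += velocity * dt
    H = np.hstack([np.eye(3), np.zeros((3, 3))])  # measure position only
    return F, H


def predict(x, P, F, Q):
    """Kalman predict: propagate the state mean and covariance one step."""
    return F @ x, F @ P @ F.T + Q


def gate(z, x_pred, P_pred, H, R, threshold=CHI2_99_3DOF):
    """Return (accepted, d2), with d2 the squared Mahalanobis distance."""
    y = z - H @ x_pred                      # innovation
    S = H @ P_pred @ H.T + R                # innovation covariance
    d2 = float(y @ np.linalg.solve(S, y))
    return d2 <= threshold, d2


def update(z, x_pred, P_pred, H, R):
    """Kalman update with the standard gain K = P H^T S^{-1}."""
    y = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x = x_pred + K @ y
    P = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x, P


def step(x, P, z, F, H, Q, R):
    """Predict, then update only if the detection passes the gate."""
    x_pred, P_pred = predict(x, P, F, Q)
    accepted, _ = gate(z, x_pred, P_pred, H, R)
    if accepted:
        x_pred, P_pred = update(z, x_pred, P_pred, H, R)
    return x_pred, P_pred, accepted
```

As a usage example under assumed values (`dt = 0.1`, `Q = 0.01 * np.eye(6)`, `R = 0.1 * np.eye(3)`, identity initial covariance), a track at the origin moving at 1 m/s along x accepts a detection at `[0.15, 0, 0]` and rejects one at `[10, 0, 0]`. In the rejected case the track coasts on its prediction, so the outlier cannot pull the state away.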