🤖 AI Summary
To address incomplete perception in single-agent 3D multi-object tracking (3D MOT) for autonomous driving—caused by occlusions and sensor failures—this paper proposes the first graph signal processing (GSP)-based collaborative multi-vehicle tracking framework. We construct a fully connected graph to model geometric and motion coherence among detection bounding boxes across vehicles, and introduce graph Laplacian regularization to jointly smooth detection errors and enhance cross-vehicle consistency during both localization refinement and trajectory association. This work is the first to systematically integrate the GSP paradigm into collaborative MOT, enabling end-to-end fusion of multi-source LiDAR detections. Evaluated on the real-world V2V4Real dataset, our method achieves +4.2% MOTA and +5.8% IDF1 over state-of-the-art approaches such as DMSTrack, demonstrating significant gains in both robustness and accuracy.
📝 Abstract
Multi-Object Tracking (MOT) plays a crucial role in autonomous driving systems, as it lays the foundation for advanced perception and precise path-planning modules. Nonetheless, single-agent MOT offers only incomplete sensing of the surroundings due to occlusions, sensor failures, etc. Hence, integrating multi-agent information is essential for a comprehensive understanding of the environment. This paper proposes a novel cooperative MOT framework for tracking objects in 3D LiDAR scenes by formulating and solving a graph topology-aware optimization problem that fuses information from multiple vehicles. Exploiting a fully connected graph topology defined over the detected bounding boxes, we employ graph Laplacian optimization to smooth the position errors of the bounding boxes and combine them effectively. In this manner, we reveal and leverage the inherent coherence among diverse multi-agent detections, and associate the refined bounding boxes to tracked objects in two stages, optimizing both localization and tracking accuracy. An extensive evaluation study has been conducted on the real-world V2V4Real dataset, where the proposed method significantly outperforms baseline frameworks, including the state-of-the-art deep-learning-based DMSTrack and V2V4Real, across various testing sequences.
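The graph Laplacian smoothing described above can be sketched in a few lines. This is a minimal illustration of the general technique, not the paper's implementation: detection box centers from multiple vehicles form the nodes of a fully connected graph with Gaussian edge weights, and the regularized least-squares problem min_X ||X − Y||² + λ·tr(XᵀLX) has the closed-form solution X = (I + λL)⁻¹Y. The kernel bandwidth `sigma` and regularization weight `lam` are assumed hyperparameters.

```python
import numpy as np

def laplacian_smooth(centers, sigma=2.0, lam=0.5):
    """Smooth 3D detection box centers via graph Laplacian regularization.

    Solves min_X ||X - Y||^2 + lam * tr(X^T L X), whose closed-form
    solution is X = (I + lam * L)^{-1} Y, where L is the combinatorial
    Laplacian of a fully connected graph with Gaussian edge weights.
    (Illustrative sketch; sigma and lam are assumed hyperparameters.)
    """
    Y = np.asarray(centers, dtype=float)                # (N, 3) box centers
    d2 = np.sum((Y[:, None] - Y[None]) ** 2, axis=-1)   # pairwise sq. distances
    W = np.exp(-d2 / (2.0 * sigma ** 2))                # Gaussian affinities
    np.fill_diagonal(W, 0.0)                            # no self-loops
    L = np.diag(W.sum(axis=1)) - W                      # Laplacian L = D - W
    return np.linalg.solve(np.eye(len(Y)) + lam * L, Y)

# Two vehicles detect the same object with noisy offsets; a third detection
# is far away and thus barely coupled to the pair.
detections = np.array([[10.0, 5.0, 0.0],
                       [10.6, 5.4, 0.1],
                       [30.0, -2.0, 0.0]])
refined = laplacian_smooth(detections)
```

Because the graph is fully connected, nearby detections of the same object are pulled toward a common position, while distant detections receive near-zero edge weights and stay essentially unchanged.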