🤖 AI Summary
This work addresses identity inconsistency and degraded segmentation accuracy in multi-object tracking and segmentation (MOTS), which stem from unreliable track association and the propagation of false-positive detections. The paper proposes Seg2Track++, a zero-shot MOTS framework that combines instance segmentation and SAM2 with a novel track management module. Data association relies on Mask Centroid Distance (MCD) and Confidence-Aware Cost Modulation (CCM), while Probabilistic Track Validation (PTV) uses a Bernoulli filter to validate track existence and improve temporal consistency. Without any fine-tuning, the framework suppresses ghost tracks and, on the KITTI MOTS benchmark, shows improved identity preservation and reduced false-positive propagation.
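The summary names MCD and CCM but gives no formulas, so the following is only a rough illustration of how a centroid-based cost could be modulated by detection confidence. Everything in it is an assumption made for the example rather than the paper's definition: the function names, the `max_dist` normalisation, the choice to divide the cost by the confidence, and the greedy matcher (used here in place of whatever assignment solver the authors use).

```python
from math import hypot


def mask_centroid(mask):
    """Return the (row, col) centroid of a binary mask given as rows of 0/1."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pts:
        return None  # empty mask has no centroid
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)


def association_cost(track_mask, det_mask, det_conf, max_dist=50.0):
    """Assumed cost: normalised centroid distance, inflated for low confidence."""
    ct, cd = mask_centroid(track_mask), mask_centroid(det_mask)
    if ct is None or cd is None:
        return float("inf")
    dist = hypot(ct[0] - cd[0], ct[1] - cd[1]) / max_dist
    # Hypothetical modulation: a detection with confidence 0.5 costs twice
    # as much as a fully confident one at the same distance.
    return dist / max(det_conf, 1e-6)


def greedy_associate(track_masks, det_masks, det_confs, max_cost=1.0):
    """Greedily pair tracks and detections in order of increasing cost."""
    candidates = sorted(
        (association_cost(t, d, c), ti, di)
        for ti, t in enumerate(track_masks)
        for di, (d, c) in enumerate(zip(det_masks, det_confs))
    )
    used_tracks, used_dets, matches = set(), set(), []
    for cost, ti, di in candidates:
        if cost > max_cost:
            break  # all remaining candidates are above the gate
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return matches
```

Here `greedy_associate` returns `(track_index, detection_index)` pairs; tracks and detections left unpaired would be handed to track management. An optimal assignment solver such as `scipy.optimize.linear_sum_assignment` could replace the greedy loop without changing the cost function.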
📝 Abstract
Autonomous systems require robust Multi-Object Tracking and Segmentation (MOTS) to operate reliably in dynamic environments, ensuring consistent object identities and precise mask-level delineation. Foundation models such as SAM2 have shown strong zero-shot generalization for segmentation, but their direct application to MOTS is limited by unreliable track association and false-positive propagation. This work introduces Seg2Track++, a framework that integrates instance segmentation with SAM2 and a novel track management module to perform zero-shot MOTS with enhanced temporal consistency. Tracks are associated using Mask Centroid Distance (MCD) and Confidence-Aware Cost Modulation (CCM), while Probabilistic Track Validation (PTV) employs a Bernoulli filter to validate track existence and suppress ghost tracks. Experimental results on KITTI MOTS demonstrate improved identity preservation, reduced false-positive propagation, and robust track management without fine-tuning.
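To make the idea of validating track existence with a Bernoulli filter more concrete, here is a minimal sketch of the textbook existence-probability recursion for a single track. It is not the paper's PTV module: the state-independent simplification, the parameter values (`p_survive`, `p_birth`, `p_detect`, `likelihood_ratio`), and the deletion threshold are all assumptions chosen for illustration.

```python
def predict_existence(r, p_survive=0.95, p_birth=0.0):
    """Prediction step: the track survives with p_survive or is newly born."""
    return p_birth * (1.0 - r) + p_survive * r


def update_existence(r, matched, p_detect=0.9, likelihood_ratio=5.0):
    """Update step of a simplified Bernoulli filter.

    likelihood_ratio is the assumed ratio of the measurement likelihood to
    the clutter intensity for a matched detection; it is 0 when the track
    received no detection in this frame.
    """
    ratio = likelihood_ratio if matched else 0.0
    delta = p_detect * (1.0 - ratio)
    return (1.0 - delta) * r / (1.0 - delta * r)


def run_track(observations, r0=0.5, delete_below=0.1):
    """Propagate existence probability over frames; stop once it drops too low."""
    r = r0
    history = []
    for matched in observations:
        r = update_existence(predict_existence(r), matched)
        history.append(r)
        if r < delete_below:
            break  # hypothetical deletion rule for a ghost track
    return history


# A track seen once and then missed twice decays quickly,
# while a consistently matched track approaches certainty.
ghost = run_track([True, False, False])
stable = run_track([True] * 5)
```

Under these assumed parameters, a track supported by a single spurious detection falls below the deletion threshold after two missed frames, which is the kind of behaviour that suppresses ghost tracks.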