No Train Yet Gain: Towards Generic Multi-Object Tracking in Sports and Beyond

📅 2025-06-02
🤖 AI Summary
Multi-object tracking (MOT) in sports scenes is difficult because of rapid motion, frequent occlusions, and camera motion. Conventional tracking-by-detection methods need extensive tuning, while segmentation-based approaches struggle to process tracks over time. This paper proposes McByte, a tracking-by-detection framework that requires no training and no per-video tuning. McByte uses temporally propagated segmentation masks, produced by pre-trained models, as an association cue alongside commonly used object detectors. The design aims to improve robustness to fast motion, occlusion, and camera shifts. McByte is evaluated on SportsMOT, DanceTrack, SoccerNet-tracking 2022, and MOT17, where it shows strong performance across sports and general pedestrian tracking.

📝 Abstract
Multi-object tracking (MOT) is essential for sports analytics, enabling performance evaluation and tactical insights. However, tracking in sports is challenging due to fast movements, occlusions, and camera shifts. Traditional tracking-by-detection methods require extensive tuning, while segmentation-based approaches struggle with track processing. We propose McByte, a tracking-by-detection framework that integrates a temporally propagated segmentation mask as an association cue to improve robustness without per-video tuning. Unlike many existing methods, McByte does not require training, relying solely on pre-trained models and object detectors commonly used in the community. Evaluated on SportsMOT, DanceTrack, SoccerNet-tracking 2022, and MOT17, McByte demonstrates strong performance across sports and general pedestrian tracking. Our results highlight the benefits of mask propagation for a more adaptable and generalizable MOT approach. Code will be made available at https://github.com/tstanczyk95/McByte.
Problem

Research questions and friction points this paper is trying to address.

Improving multi-object tracking robustness in sports without per-video tuning
Eliminating the need for training in tracking-by-detection frameworks
Enhancing tracking accuracy across sports and general scenarios using mask propagation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses mask propagation for robust tracking
Requires no training; relies on pre-trained models
Integrates segmentation masks as association cues
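
To make the idea of a mask as an association cue concrete, here is a minimal sketch, not the authors' implementation. It assumes each track carries a last known box and a propagated mask (represented here as a set of pixel coordinates), and it lowers the usual IoU-based cost when a track's mask falls inside a detection box. The names `mask_in_box_fraction`, `associate`, `mask_weight`, and `max_cost` are hypothetical, the cost rule is an illustrative assumption, and simple greedy matching stands in for the optimal assignment step that tracking-by-detection pipelines commonly use.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mask_in_box_fraction(mask, box):
    """Fraction of a track's propagated mask pixels that lie inside a box."""
    if not mask:
        return 0.0
    x1, y1, x2, y2 = box
    inside = sum(1 for (x, y) in mask if x1 <= x < x2 and y1 <= y < y2)
    return inside / len(mask)


def associate(tracks, detections, mask_weight=0.5, max_cost=0.9):
    """Greedy track-to-detection matching (hypothetical cost rule).

    tracks: list of dicts with 'box' and 'mask' (set of (x, y) pixels).
    detections: list of boxes (x1, y1, x2, y2).
    Returns sorted (track_index, detection_index) pairs.
    """
    candidates = []
    for ti, trk in enumerate(tracks):
        for di, det in enumerate(detections):
            # Start from the usual IoU cost, then reward mask agreement.
            cost = 1.0 - box_iou(trk["box"], det)
            cost -= mask_weight * mask_in_box_fraction(trk["mask"], det)
            if cost <= max_cost:
                candidates.append((cost, ti, di))
    candidates.sort()  # cheapest pairs first
    matches, used_tracks, used_dets = [], set(), set()
    for _, ti, di in candidates:
        if ti not in used_tracks and di not in used_dets:
            matches.append((ti, di))
            used_tracks.add(ti)
            used_dets.add(di)
    return sorted(matches)
```

Setting `mask_weight=0` reduces the sketch to plain IoU matching, which makes it easy to compare the two behaviours on the same input, for example when two tracks have stale, overlapping boxes but distinct masks.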
Tomasz Stanczyk
Inria, France; Université Côte d’Azur, France
Seongro Yoon
INRIA, France
Computer vision, Machine learning
Francois Bremond
Inria, France; Université Côte d’Azur, France