🤖 AI Summary
In multi-object tracking (MOT), Kalman filters offer computational efficiency but struggle to model nonlinear motion, whereas data-driven predictors have strong representational capacity yet suffer from poor generalization and high inference overhead. This paper proposes the first adaptive fusion framework that discriminates motion patterns via multi-perceptive motion analysis and computes online blending weights to integrate Kalman filtering and data-driven prediction—without modifying existing models. Its core contribution lies in uncovering and exploiting the complementary nature of linear and nonlinear motion in real-world scenarios, thereby achieving the first adaptive unification of classical and modern motion modeling paradigms. Extensive experiments demonstrate significant improvements in HOTA and IDF1 on MOT17 and MOT20, and state-of-the-art performance on DanceTrack, validating the method’s effectiveness, generalizability, and plug-and-play compatibility.
📝 Abstract
Multi-object tracking (MOT) predominantly follows the tracking-by-detection paradigm, where Kalman filters serve as the standard motion predictor due to computational efficiency but inherently fail on non-linear motion patterns. Conversely, recent data-driven motion predictors capture complex non-linear dynamics but suffer from limited domain generalization and computational overhead. Through extensive analysis, we reveal that even in datasets dominated by non-linear motion, the Kalman filter outperforms data-driven predictors in up to 34% of cases, demonstrating that real-world tracking scenarios inherently involve both linear and non-linear patterns. To leverage this complementarity, we propose PlugTrack, a novel framework that adaptively fuses the Kalman filter and data-driven motion predictors, employing multi-perceptive motion analysis to generate adaptive blending factors. PlugTrack achieves significant performance gains on MOT17/MOT20 and state-of-the-art on DanceTrack without modifying existing motion predictors. To the best of our knowledge, PlugTrack is the first framework to bridge classical and modern motion prediction paradigms through adaptive fusion in MOT.
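The core fusion idea can be illustrated with a minimal sketch. The paper's actual multi-perceptive motion analysis is not specified here, so the `motion_linearity` heuristic below (scoring how well recent box centers fit a constant-velocity model) is purely an assumed stand-in; only the convex blending of the two predictors' outputs reflects the framework's general shape.

```python
import numpy as np

def blend_predictions(kf_pred, dd_pred, alpha):
    """Convex combination of the two motion predictions.
    alpha near 1 trusts the Kalman filter (linear motion);
    alpha near 0 trusts the data-driven predictor (non-linear motion)."""
    return alpha * kf_pred + (1.0 - alpha) * dd_pred

def motion_linearity(history, eps=1e-6):
    """Hypothetical stand-in for the paper's multi-perceptive analysis:
    returns a score in (0, 1] measuring how well recent box centers fit a
    constant-velocity model (1.0 = perfectly linear motion)."""
    history = np.asarray(history, dtype=float)   # (T, 2) track centers
    vel = np.diff(history, axis=0)               # per-step velocities
    mean_vel = vel.mean(axis=0)
    # Average deviation from constant velocity, relative to speed
    dev = np.linalg.norm(vel - mean_vel, axis=1).mean()
    scale = np.linalg.norm(mean_vel) + eps
    return float(1.0 / (1.0 + dev / scale))

# A near-linear track yields alpha ≈ 1, so the fused output stays
# close to the Kalman filter's prediction.
linear_track = [(0, 0), (1, 1), (2, 2), (3, 3)]
alpha = motion_linearity(linear_track)
fused = blend_predictions(np.array([4.0, 4.0]), np.array([3.5, 4.5]), alpha)
```

In the full framework the blending factor is produced per track by learned motion analysis rather than a hand-crafted score, which is what lets it adapt online without touching either underlying predictor.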