🤖 AI Summary
Existing Kalman filter–based 3D multi-object tracking methods typically employ uniform process and observation noise covariances, overlooking the distinct motion and observation uncertainties across different traffic participants. This work proposes a category-aware, object-frame-aligned noise modeling framework that learns diagonal covariance matrices separately for each object class, explicitly preserving longitudinal–lateral anisotropy. To our knowledge, this is the first approach in Kalman filter–based 3D MOT to incorporate category-specific, coordinate-aligned noise modeling, which significantly improves tracking accuracy, reduces identity switches, and enhances uncertainty calibration. Evaluated on the nuScenes benchmark, the method outperforms current state-of-the-art approaches while revealing a prevalent overconfidence issue in existing Kalman filter trackers.
📝 Abstract
Kalman filter (KF)-based multi-object tracking (MOT) remains a strong baseline for autonomous driving owing to its competitive performance, computational efficiency, and interpretability. In most practical systems, the process and measurement noise covariances are defined globally and shared across object classes, which presumes identical uncertainty characteristics across heterogeneous traffic participants.
This work revisits that assumption and proposes CANMOT, a class-aware and object-aligned noise modeling framework for KF-based 3D MOT. Class-specific diagonal process and measurement covariance matrices are introduced and optionally expressed in the object coordinate frame to preserve longitudinal-lateral anisotropy.
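To illustrate what an object-aligned, class-specific diagonal covariance can look like in practice, the following is a minimal sketch and not the CANMOT implementation. The class names, the standard deviations in `CLASS_SIGMAS`, the restriction to the 2D position block, and the function name `object_aligned_cov` are all assumptions made for this example. The covariance is diagonal in the object frame (longitudinal, lateral) and is rotated into the world frame using the object's yaw.

```python
import numpy as np

# Hypothetical per-class standard deviations (longitudinal, lateral) in metres.
# Illustrative values only; they are not taken from the paper.
CLASS_SIGMAS = {
    "car": (0.5, 0.1),         # more uncertainty along the driving direction
    "pedestrian": (0.2, 0.2),  # roughly isotropic motion
}


def object_aligned_cov(class_name: str, yaw: float) -> np.ndarray:
    """Return a 2x2 world-frame covariance for the given class and heading.

    The covariance is diagonal in the object frame and is rotated by the
    yaw angle (radians) so that its principal axes follow the object.
    """
    sigma_lon, sigma_lat = CLASS_SIGMAS[class_name]
    cov_obj = np.diag([sigma_lon**2, sigma_lat**2])

    # Standard 2D rotation from object frame to world frame.
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s], [s, c]])
    return rot @ cov_obj @ rot.T


if __name__ == "__main__":
    # A car heading along the world y-axis: the larger variance moves to y.
    print(object_aligned_cov("car", np.pi / 2))
```

With a single global covariance, the same matrix would be used for every class and heading. Here the anisotropy is preserved per class and turns with the object, while an isotropic entry such as the hypothetical `"pedestrian"` one is unaffected by the rotation.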
Systematic experiments on the nuScenes benchmark show that class-aware and object-aligned noise modeling improves tracking performance and substantially reduces identity switches compared to state-of-the-art (SotA) methods. In addition, the consistency of the estimated uncertainty is analyzed using the Average Normalized Estimation Error Squared (ANEES) and $\chi^2$-based violation tests. The results reveal severe overconfidence in standard KF-based MOT baselines. While the proposed formulation improves calibration without modifying the underlying filtering framework, it still exhibits substantial inconsistency, highlighting the need for further research in this area.
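The consistency analysis can be sketched as follows, under several assumptions: errors are 2D position errors, ANEES is taken as the mean NEES divided by the state dimension (so a consistent filter gives a value near 1; the paper's exact normalization may differ), and the violation test counts samples whose NEES exceeds the $\chi^2$ quantile at a chosen confidence level. The function names are hypothetical. For two degrees of freedom the $\chi^2$ quantile has the closed form $-2\ln(1-p)$, which avoids an extra dependency.

```python
import numpy as np


def nees(errors: np.ndarray, covs: np.ndarray) -> np.ndarray:
    """Per-sample NEES e^T P^{-1} e.

    errors: (N, d) estimation errors; covs: (N, d, d) estimated covariances.
    """
    # Solve P x = e for each sample instead of inverting P explicitly.
    solved = np.linalg.solve(covs, errors[..., None])[..., 0]
    return np.einsum("nd,nd->n", errors, solved)


def anees(errors: np.ndarray, covs: np.ndarray) -> float:
    """Mean NEES divided by the state dimension (assumed normalization)."""
    d = errors.shape[1]
    return float(nees(errors, covs).mean() / d)


def chi2_violation_rate_2d(errors, covs, confidence: float = 0.95) -> float:
    """Fraction of 2D samples whose NEES exceeds the chi-square quantile."""
    if errors.shape[1] != 2:
        raise ValueError("closed-form quantile below is valid for d = 2 only")
    # Chi-square with 2 degrees of freedom: CDF(x) = 1 - exp(-x / 2).
    threshold = -2.0 * np.log(1.0 - confidence)
    return float(np.mean(nees(errors, covs) > threshold))


if __name__ == "__main__":
    errors = np.array([[1.0, 0.0], [0.0, 2.0]])
    honest = np.tile(np.eye(2), (2, 1, 1))
    overconfident = 0.1 * honest  # covariances that are too small
    print(anees(errors, honest), anees(errors, overconfident))
```

An ANEES well above 1, or a violation rate well above `1 - confidence`, indicates that the filter reports covariances that are too small, which is the overconfidence pattern described above.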
Code is available at https://github.com/rst-tu-dortmund/learned-3d-nms.