🤖 AI Summary
Existing multi-object tracking (MOT) methods are designed for pinhole cameras and degrade significantly on panoramic images because of resolution loss, geometric deformation, and uneven lighting. To address this, we propose OmniTrack, an omnidirectional MOT framework for panoramic imagery. It comprises three core components: (1) Tracklet Management, which introduces temporal cues; (2) FlexiTrack Instances, which handle object localization and association; and (3) the CircularStatE Module, which alleviates image and geometric distortions. We also introduce QuadTrack, a panoramic MOT dataset collected by a quadruped robot, to mitigate the lack of panoramic MOT datasets. OmniTrack achieves HOTA scores of 26.92% on JRDB (an improvement of 3.43%) and 23.45% on QuadTrack (6.81% above the baseline). The dataset and code will be made publicly available.
📝 Abstract
Panoramic imagery, with its 360° field of view, offers comprehensive information to support Multi-Object Tracking (MOT) in capturing spatial and temporal relationships of surrounding objects. However, most MOT algorithms are tailored for pinhole images with limited views, impairing their effectiveness in panoramic settings. Additionally, panoramic image distortions, such as resolution loss, geometric deformation, and uneven lighting, hinder direct adaptation of existing MOT methods, leading to significant performance degradation. To address these challenges, we propose OmniTrack, an omnidirectional MOT framework that incorporates Tracklet Management to introduce temporal cues, FlexiTrack Instances for object localization and association, and the CircularStatE Module to alleviate image and geometric distortions. This integration enables tracking in large field-of-view scenarios, even under rapid sensor motion. To mitigate the lack of panoramic MOT datasets, we introduce the QuadTrack dataset, a comprehensive panoramic dataset collected by a quadruped robot, featuring diverse challenges such as wide fields of view, intense motion, and complex environments. Extensive experiments on the public JRDB dataset and the newly introduced QuadTrack benchmark demonstrate the state-of-the-art performance of the proposed framework. OmniTrack achieves a HOTA score of 26.92% on JRDB, representing an improvement of 3.43%, and further achieves 23.45% on QuadTrack, surpassing the baseline by 6.81%. The dataset and code will be made publicly available at https://github.com/xifen523/OmniTrack.