🤖 AI Summary
This work addresses the challenge of evaluating perception performance under the extreme dynamic conditions of high-speed autonomous racing, where large relative velocities and significant domain shifts render existing benchmarks inadequate. The authors propose the first unified LiDAR-based multi-task benchmark tailored to high-speed racing, integrating real-world racecar data, simulator-generated sequences, and urban datasets under standardized evaluation protocols for 3D object detection and trajectory prediction. Their analysis shows that pretraining on urban data improves detection over training from scratch (NDS 0.72 vs. 0.69), while intermediate pretraining on real racecar data achieves the best overall result on the A2RL benchmark (NDS 0.726). Moreover, in trajectory prediction on A2RL, models trained on IndyCar data significantly outperform in-domain training, yielding a notably lower final displacement error (FDE 0.947 vs. 1.250) and establishing a new paradigm for perception in extreme driving scenarios.
📝 Abstract
High-speed autonomous racing presents extreme perception challenges, including large relative velocities and substantial domain shifts from conventional urban-driving datasets. Existing benchmarks do not adequately capture these high-dynamic conditions. We introduce EagleVision, a unified LiDAR-based multi-task benchmark for 3D detection and trajectory prediction in high-speed racing, providing newly annotated 3D bounding boxes for the Indy Autonomous Challenge dataset (14,893 frames) and the A2RL Real competition dataset (1,163 frames), together with 12,000 simulator-generated annotated frames, all standardized under a common evaluation protocol. Using a dataset-centric transfer framework, we quantify cross-domain generalization across urban, simulator, and real racing domains. Urban pretraining improves detection over scratch training (NDS 0.72 vs. 0.69), while intermediate pretraining on real racing data achieves the best transfer to A2RL (NDS 0.726), outperforming simulator-only adaptation. For trajectory prediction, Indy-trained models surpass in-domain A2RL training on A2RL test sequences (FDE 0.947 vs. 1.250), highlighting the role of motion-distribution coverage in cross-domain forecasting. EagleVision enables systematic study of perception generalization under extreme high-speed dynamics. The dataset and benchmark are publicly available at https://avlab.io/EagleVision
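The FDE figures quoted above measure the end-point error of a forecast trajectory. A minimal sketch of the metric, assuming 2D waypoints in meters (the function name and array shapes are illustrative, not the benchmark's actual code):

```python
import numpy as np

def final_displacement_error(pred, gt):
    """Final displacement error (FDE): Euclidean distance between the
    predicted and ground-truth positions at the last forecast step.

    pred, gt: arrays of shape (T, 2) holding x/y waypoints in meters.
    Illustrative sketch; the benchmark's exact protocol may differ.
    """
    return float(np.linalg.norm(np.asarray(pred)[-1] - np.asarray(gt)[-1]))

# Example: the prediction ends 3 m ahead and 4 m to the side of the
# ground-truth end point, so FDE = sqrt(3^2 + 4^2) = 5.0
pred = np.array([[0.0, 0.0], [10.0, 0.0], [23.0, 4.0]])
gt = np.array([[0.0, 0.0], [10.0, 0.0], [20.0, 0.0]])
print(final_displacement_error(pred, gt))  # → 5.0
```

Only the final waypoint enters the score, which is why FDE is sensitive to how well a model covers the motion distribution over the full horizon.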