🤖 AI Summary
This work addresses two challenges in autonomous racing: environmental variations degrade the robustness of traditional vision algorithms, and neural networks struggle to combine real-time performance with limited training data. To overcome these issues, the authors propose a lightweight UNet-based keypoint regression method for high-precision 3D cone localization and color prediction. By constructing the largest annotated cone dataset to date and integrating a self-supervised data augmentation strategy, the model achieves significantly improved localization accuracy while maintaining real-time inference. Notably, this approach is the first to effectively integrate keypoint prediction into an end-to-end perception-to-control pipeline, outperforming existing methods across multiple metrics and demonstrating its practicality in competitive autonomous racing.
📝 Abstract
Accurate cone localization in 3D space is essential in autonomous racing for precise navigation around the track. Approaches that rely on traditional computer vision algorithms are sensitive to environmental variations, while neural-network approaches are often trained on limited data and are too slow to run in real time. We present a UNet-based neural network for keypoint detection on cones, trained on the largest custom-labeled dataset we have assembled. Our approach enables accurate cone position estimation and opens the possibility of color prediction. Our model achieves substantial improvements in keypoint accuracy over conventional methods. Furthermore, we use the predicted keypoints in the perception pipeline and evaluate the end-to-end autonomous system. Our results show high-quality performance across all metrics, highlighting the effectiveness of this approach and its potential for adoption in competitive autonomous racing systems.
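
To illustrate how 2D keypoints on a cone can yield a 3D position estimate, the following is a minimal sketch based on the standard pinhole camera model. It is not the authors' method, which the abstract does not specify. The keypoint layout (one apex and two base corners), the `Intrinsics` type, the function name `cone_position_from_keypoints`, and the example cone height are all assumptions made for illustration. The sketch further assumes an upright cone, a roughly level camera, and undistorted pixel coordinates.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical container)."""
    fx: float
    fy: float
    cx: float
    cy: float


def cone_position_from_keypoints(apex, base_left, base_right, cone_height_m, cam):
    """Estimate the cone base position (x, y, z) in the camera frame, in meters.

    apex, base_left, base_right are (u, v) pixel coordinates of assumed
    keypoints. Depth comes from similar triangles: a cone of known physical
    height H that spans h pixels vertically lies at depth z = fy * H / h.
    """
    # Midpoint of the two base keypoints approximates the cone's ground contact.
    base_u = (base_left[0] + base_right[0]) / 2.0
    base_v = (base_left[1] + base_right[1]) / 2.0

    # Image v grows downward, so the base lies below the apex for an upright cone.
    pixel_height = base_v - apex[1]
    if pixel_height <= 0:
        raise ValueError("apex must lie above the base in the image")

    z = cam.fy * cone_height_m / pixel_height
    # Back-project the base midpoint to 3D at the estimated depth.
    x = (base_u - cam.cx) * z / cam.fx
    y = (base_v - cam.cy) * z / cam.fy
    return (x, y, z)


# Illustrative values only: intrinsics, keypoints, and cone height are made up.
cam = Intrinsics(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
position = cone_position_from_keypoints(
    apex=(740.0, 395.0),
    base_left=(720.0, 460.0),
    base_right=(760.0, 460.0),
    cone_height_m=0.325,
    cam=cam,
)
print(position)
```

Because depth is inversely proportional to the apex-to-base pixel distance, a one-pixel keypoint error matters far more for distant cones than for nearby ones, which is one reason keypoint accuracy feeds directly into localization accuracy. A pipeline with more keypoints per cone could instead solve a perspective-n-point problem, which relaxes the level-camera assumption.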