🤖 AI Summary
This work addresses the challenge of efficiently mining high-value, safety-critical, and planning-relevant scenarios from large-scale autonomous driving logs. The proposed method, AutoMine, is a self-refining scenario mining approach built on large language models (LLMs) and vision-language models (VLMs). It uses semantics-preserving prompt augmentation to reduce sensitivity to prompt variations, combines robust trajectory atomic functions with VLM-based functions to handle perception noise and open-world visual cues, and refines generated code using execution feedback from real driving logs. In the Argoverse 2 Scenario Mining Competition at CVPR 2026, AutoMine achieves a HOTA-Temporal score of 36.38 and a Timestamp BA score of 77.21.
📝 Abstract
With the development of autonomous driving systems, mining high-value, safety-critical, and planning-relevant scenarios from large-scale driving logs has become essential for data-driven evaluation. In this paper, we propose AutoMine, a robust self-refining scenario mining method based on LLMs and VLMs. AutoMine uses semantics-preserving prompt augmentation to reduce LLM prompt sensitivity, combines robust trajectory atomic functions with VLM-based functions to handle perception noise and open-world visual cues, and refines generated code through execution feedback from real logs. In the Argoverse 2 Scenario Mining Competition at CVPR 2026, AutoMine achieves a HOTA-Temporal score of 36.38 and a Timestamp BA score of 77.21.
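The execution-feedback refinement described above can be illustrated with a minimal Python sketch. This is not AutoMine's implementation: the names `run_candidate`, `self_refine`, `stub_generate`, and `mine`, and the log layout with a `speeds` list, are all hypothetical, and the LLM is replaced by a stub that corrects its code once it sees an error message. The sketch only shows the general loop of generating code, running it on a log, and feeding the error back.

```python
import traceback
from typing import Callable, List, Optional, Tuple

# Hypothetical generator signature: (description, feedback) -> Python source.
Generator = Callable[[str, Optional[str]], str]


def run_candidate(source: str, log: dict) -> Tuple[Optional[List[int]], Optional[str]]:
    """Run candidate code on one log and return (result, error_text)."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # assumption: real use would sandbox this call
        return namespace["mine"](log), None
    except Exception:
        return None, traceback.format_exc(limit=1)


def self_refine(generate: Generator, description: str, log: dict,
                max_rounds: int = 3) -> Tuple[str, List[int], int]:
    """Regenerate code until it executes on the log or rounds run out."""
    feedback: Optional[str] = None
    for round_idx in range(1, max_rounds + 1):
        source = generate(description, feedback)
        result, error = run_candidate(source, log)
        if error is None:
            return source, result, round_idx
        feedback = error  # the error text drives the next generation
    raise RuntimeError("no executable candidate within max_rounds")


def stub_generate(description: str, feedback: Optional[str]) -> str:
    """Stand-in for an LLM: the first draft uses a wrong key, the retry fixes it."""
    key = "speed" if feedback is None else "speeds"
    return (
        "def mine(log):\n"
        f"    return [t for t, v in enumerate(log['{key}']) if v > 10]\n"
    )


# Toy log: per-timestamp speeds in m/s (illustrative data, not from the paper).
log = {"speeds": [3.0, 12.5, 9.0, 15.0]}
source, result, rounds = self_refine(
    stub_generate, "vehicle moving faster than 10 m/s", log
)
```

Here the first draft fails with a `KeyError`, the error text is passed back to the generator, and the second draft runs and returns the matching timestamp indices. A real system would also need to check that the output is semantically correct, not just that the code executes.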