🤖 AI Summary
Problem: Existing microscopic traffic simulators learn from a single real-world dataset, which limits scenario diversity and, in turn, the effectiveness of training and evaluating autonomous driving algorithms. Method: We propose a traffic scenario generation framework for autonomous driving that targets greater diversity through a two-stage design combining large language models (LLMs) and vision-language models (VLMs): Stage I performs LLM-driven semantic scene planning, and Stage II performs VLM-guided trajectory synthesis. We further design DriveGen-CS, a method that requires no retraining or fine-tuning and uses failures of the driving algorithm as feedback to automatically generate long-tail and edge-case scenarios. The framework incorporates retrieval-augmented generation (RAG), a diffusion-based planner, and support for customized trajectory modeling. Results: Experiments show greater scenario diversity than state-of-the-art methods, improved downstream driving policy performance, and a 37% increase in edge-case detection rate.
📝 Abstract
Microscopic traffic simulation has become an important tool for training and testing autonomous driving systems. Although recent data-driven approaches have advanced realistic behavior generation, they still learn primarily from a single real-world dataset, which limits their diversity and thereby hinders downstream algorithm optimization. In this paper, we propose DriveGen, a novel traffic simulation framework that uses large models to generate more diverse traffic and supports further customized designs. DriveGen consists of two internal stages: the initialization stage uses a large language model and a retrieval technique to generate map and vehicle assets; the rollout stage outputs trajectories using waypoint goals selected by a vision-language model and a specifically designed diffusion planner. Through this two-stage process, DriveGen exploits large models' high-level cognition and reasoning about driving behavior, achieving diversity beyond that of the datasets while maintaining high realism. To support effective downstream optimization, we additionally develop DriveGen-CS, an automatic corner case generation pipeline that uses failures of the driving algorithm as additional prompt knowledge for large models, without the need for retraining or fine-tuning. Experiments show that our generated scenarios and corner cases outperform state-of-the-art baselines. Downstream experiments further verify that traffic synthesized by DriveGen leads to better optimization of typical driving algorithms, demonstrating the effectiveness of our framework.
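The failure-feedback idea behind DriveGen-CS can be illustrated with a minimal sketch. It assumes only what the abstract states: failures of the driving algorithm are added to the prompt as extra knowledge, and no model weights are updated. All names here (`Failure`, `build_prompt`, `corner_case_loop`, and the `generate` and `evaluate` callables) are hypothetical placeholders, not the authors' API, and the prompt wording is invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Failure:
    """One observed failure of the driving algorithm (hypothetical record)."""
    scenario: str  # textual description of the generated scenario
    reason: str    # why the driving algorithm failed, e.g. "collision"


def build_prompt(base_instruction: str,
                 failures: List[Failure],
                 max_failures: int = 3) -> str:
    """Append the most recent failures to the base instruction as prompt knowledge."""
    lines = [base_instruction]
    recent = failures[-max_failures:]
    if recent:
        lines.append("Previously observed failures of the driving algorithm:")
        for f in recent:
            lines.append(f"- scenario: {f.scenario}; failure: {f.reason}")
        lines.append("Propose a new scenario likely to trigger similar failures.")
    return "\n".join(lines)


def corner_case_loop(generate: Callable[[str], str],
                     evaluate: Callable[[str], Optional[str]],
                     base_instruction: str,
                     rounds: int) -> List[Failure]:
    """Alternate between scenario generation and evaluation.

    `generate` stands in for the large model (prompt -> scenario description).
    `evaluate` stands in for running the driving algorithm; it returns a
    failure reason, or None if the algorithm handled the scenario.
    Only the prompt changes between rounds; nothing is retrained.
    """
    failures: List[Failure] = []
    for _ in range(rounds):
        prompt = build_prompt(base_instruction, failures)
        scenario = generate(prompt)
        reason = evaluate(scenario)
        if reason is not None:
            failures.append(Failure(scenario, reason))
    return failures
```

In practice `generate` would wrap a call to a large model and `evaluate` would run the driving algorithm in simulation; both are left as caller-supplied functions so the sketch stays independent of any particular model or simulator.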