🤖 AI Summary
This work addresses three limitations of large language models in autonomous driving (insufficient reasoning diversity, high computational overhead, and static learning capabilities) by proposing LUNA-AD, a lightweight, uncertainty-aware framework with lifelong learning. The approach has a tri-system architecture: a multi-agent hypothesis exploration module generates diverse, uncertainty-aware decision-making demonstrations; a distilled dual-head lightweight heuristic model enables efficient inference of decision distributions together with textual explanations; and a reflection-driven, closed-loop lifelong learning mechanism continuously refines multimodal candidate decisions and their rationales. Evaluated on the nuPlan benchmark, the method achieves state-of-the-art success rates in both non-reactive and reactive modes, with substantially lower inference latency than existing knowledge-driven approaches.
📝 Abstract
While large language models (LLMs) offer promising reasoning capabilities, their integration into safety-critical driving systems is hindered by limited reasoning diversity, high computational overhead, and static learning paradigms. To address these challenges, we propose LUNA-AD, a lightweight uncertainty-aware language model with lifelong learning for autonomous driving (AD). LUNA-AD features a tri-system architecture that reconciles complex multimodal behavioral reasoning, efficient deployment, and continual refinement. We design a multi-agent analytical system to generate uncertainty-aware decision-making demonstrations through diverse hypothesis exploration. A dual-head lightweight heuristic model is distilled to unify the inference of decision distributions and textual explanations while enabling efficient deployment. Furthermore, a reflection-driven lifelong learning mechanism operates on multimodal decision outputs and preserves strategic diversity, allowing for the refinement of candidate decisions and rationales via closed-loop feedback to enhance driving robustness. Extensive experiments on nuPlan benchmarks demonstrate that LUNA-AD achieves state-of-the-art success rates under both non-reactive and reactive modes, with drastically reduced inference latency compared to existing knowledge-driven AD frameworks.
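To make the idea of an uncertainty-aware decision distribution concrete, the sketch below shows one plausible way to turn per-maneuver scores into a distribution, measure its uncertainty, and keep several candidates instead of a single argmax. It is an illustration only, not the LUNA-AD implementation: the maneuver labels, the use of softmax, the normalized-entropy measure, and all function names are assumptions that do not appear in the abstract.

```python
import math

# Hypothetical maneuver labels; the abstract does not specify the decision set.
MANEUVERS = ["keep_lane", "change_left", "change_right", "decelerate"]


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(n): 0 means certain, 1 means uniform."""
    n = len(probs)
    if n < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(n)


def top_k_candidates(labels, probs, k):
    """Keep the k most probable decisions so that alternatives are preserved."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Assumed scores, standing in for the output of a decision head.
    scores = [2.0, 0.5, 0.1, 1.5]
    dist = softmax(scores)
    print("uncertainty:", round(normalized_entropy(dist), 3))
    for label, prob in top_k_candidates(MANEUVERS, dist, k=2):
        print(label, round(prob, 3))
```

Under these assumptions, a high normalized entropy would mark a scene where several maneuvers remain plausible, which is the kind of case where retaining multiple candidates and their rationales for later closed-loop refinement would matter most.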