LeapVAD: A Leap in Autonomous Driving via Cognitive Perception and Dual-Process Thinking

📅 2025-01-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current autonomous driving systems make less robust decisions in complex scenarios and when data is scarce, because they lack cognitive reasoning and knowledge-guided inference. To address this, LeapVAD introduces an autonomous driving framework that combines a human-inspired attention mechanism with dual-process cognitive reasoning: an intuitive System-I and an analytical System-II. It comprises a cognitive attention module, a scene encoder, a growing experience library, risk-aware object representation, and a closed-loop reflection mechanism, which together enable few-shot fine-tuning and continual learning from past errors. Evaluated on CARLA and DriveArena, LeapVAD outperforms camera-only baselines despite limited training data, and the scene encoder supports rapid retrieval of relevant driving experiences. Ablation studies highlight its effectiveness in continual learning and domain adaptation.

📝 Abstract
While autonomous driving technology has made remarkable strides, data-driven approaches still struggle with complex scenarios due to their limited reasoning capabilities. Meanwhile, knowledge-driven autonomous driving systems have evolved considerably with the popularization of visual language models. In this paper, we propose LeapVAD, a novel method based on cognitive perception and dual-process thinking. Our approach implements a human-like attention mechanism to identify and focus on the critical traffic elements that influence driving decisions. By characterizing these objects through comprehensive attributes (appearance, motion patterns, and associated risks), LeapVAD achieves a more effective environmental representation and streamlines the decision-making process. Furthermore, LeapVAD incorporates a dual-process decision-making module that mimics how humans learn to drive. The system consists of an Analytic Process (System-II) that accumulates driving experience through logical reasoning and a Heuristic Process (System-I) that refines this knowledge via fine-tuning and few-shot learning. LeapVAD also includes reflective mechanisms and a growing memory bank, enabling it to learn from past mistakes and continuously improve its performance in a closed-loop environment. To enhance efficiency, we develop a scene encoder network that generates compact scene representations for rapid retrieval of relevant driving experiences. Extensive evaluations conducted on two leading autonomous driving simulators, CARLA and DriveArena, demonstrate that LeapVAD achieves superior performance compared to camera-only approaches despite limited training data. Comprehensive ablation studies further emphasize its effectiveness in continuous learning and domain adaptation. Project page: https://pjlab-adg.github.io/LeapVAD/.
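
The retrieval step described in the abstract (compact scene representations used to look up relevant driving experiences in a memory bank) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Experience` and `MemoryBank` classes, the 3-dimensional embeddings, the decision labels, and the choice of cosine similarity as the ranking metric are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class Experience:
    # Hypothetical record: a compact scene embedding plus the decision taken.
    embedding: list
    decision: str


@dataclass
class MemoryBank:
    # Hypothetical growing store of past driving experiences.
    items: list = field(default_factory=list)

    def add(self, embedding, decision):
        self.items.append(Experience(list(embedding), decision))

    def retrieve(self, query, k=2):
        # Rank stored experiences by similarity to the query scene; keep the top k.
        ranked = sorted(
            self.items,
            key=lambda e: cosine(query, e.embedding),
            reverse=True,
        )
        return ranked[:k]


# Usage with made-up embeddings and decisions.
bank = MemoryBank()
bank.add([1.0, 0.0, 0.0], "decelerate")
bank.add([0.0, 1.0, 0.0], "turn left")
bank.add([0.9, 0.1, 0.0], "stop")

nearest = bank.retrieve([1.0, 0.0, 0.0], k=2)
print([e.decision for e in nearest])  # ['decelerate', 'stop']
```

A linear scan like this is enough to show the idea. A memory bank that keeps growing would more likely need an approximate nearest-neighbor index, which the abstract does not specify.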
Problem

Research questions and friction points this paper is trying to address.

Autonomous Driving
Complex Scenarios
Data Scarcity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-processing
Adaptive Learning
Intuitive Decision-making

Authors

Yukai Ma
Zhejiang University
Tiantian Wei
TUM School of Engineering and Design, Technical University of Munich, Munich, Germany; Shanghai Artificial Intelligence Laboratory, Shanghai, China
Naiting Zhong
Tongji University
Jianbiao Mei
Zhejiang University
computer vision, deep learning
Tao Hu
Science Island Branch of Graduate School, University of Science and Technology of China
Licheng Wen
Shanghai AI Laboratory
AI Agents, Autonomous Driving, Robotics
Xuemeng Yang
Shanghai Artificial Intelligence Laboratory, Shanghai, China
Botian Shi
Shanghai Artificial Intelligence Laboratory
VLMs, Document Understanding, Autonomous Driving
Yong Liu
Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, China