🤖 AI Summary
Existing end-to-end autonomous driving approaches rely on vision feature extraction supervised by limited labeled data, resulting in insufficient generalization and poor decision interpretability. To address this, we propose E³AD, a novel paradigm that integrates embodied cognition into autonomous driving. Specifically, E³AD explicitly models human driving cognition via contrastive learning between a vision feature network and a large-scale electroencephalography (EEG) foundation model, thereby uncovering latent decision-making logic. We introduce the first driving–EEG collaborative cognitive dataset and conduct comprehensive open-loop and closed-loop evaluations on multiple public benchmarks. Experiments demonstrate significant improvements in trajectory planning performance, e.g., an 18.7% reduction in trajectory L2 error on nuScenes. Ablation studies confirm the critical roles of both cognitive modeling and the contrastive learning mechanism. The code will be publicly released.
📝 Abstract
In recent years, vision-based end-to-end autonomous driving has emerged as a new paradigm. However, popular end-to-end approaches typically rely on visual feature extraction networks trained under label supervision. This limited supervision restricts the generality and applicability of driving models. In this paper, we propose a novel paradigm termed $E^{3}AD$, which advocates contrastive learning between visual feature extraction networks and a general EEG large model in order to learn latent human driving cognition and thereby enhance end-to-end planning. We first collect a cognitive dataset for this contrastive learning process. We then investigate the methods and potential mechanisms for enhancing end-to-end planning with human driving cognition, using popular driving models as baselines on publicly available autonomous driving datasets. Both open-loop and closed-loop tests are conducted for a comprehensive evaluation of planning performance. Experimental results demonstrate that the $E^{3}AD$ paradigm significantly enhances the end-to-end planning performance of the baseline models. Ablation studies further validate the contribution of driving cognition and the effectiveness of the contrastive learning process. To the best of our knowledge, this is the first work to integrate human driving cognition for improving end-to-end autonomous driving planning. It represents an initial attempt to incorporate embodied cognitive data into end-to-end autonomous driving, providing valuable insights for future brain-inspired autonomous driving systems. Our code will be made available on GitHub.
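The abstract does not specify the contrastive objective, so the following is only a minimal sketch of how paired visual and EEG embeddings could be aligned. It assumes a generic symmetric InfoNCE loss (a common choice for cross-modal alignment, not confirmed as the authors' loss) and assumes both encoders have already been projected to the same embedding dimension. The function names, the temperature value, and the random toy data are all hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(visual_emb, eeg_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired (visual, EEG) embeddings.

    Row i of visual_emb and row i of eeg_emb are assumed to come from the
    same driving moment (positive pair); all other rows act as negatives.
    """
    v = l2_normalize(visual_emb)
    e = l2_normalize(eeg_emb)
    logits = v @ e.T / temperature           # (batch, batch) similarity matrix
    idx = np.arange(len(v))                  # positives lie on the diagonal
    loss_v2e = -log_softmax(logits, axis=1)[idx, idx].mean()  # visual -> EEG
    loss_e2v = -log_softmax(logits, axis=0)[idx, idx].mean()  # EEG -> visual
    return 0.5 * (loss_v2e + loss_e2v)

# Toy data standing in for encoder outputs (hypothetical, for illustration only).
rng = np.random.default_rng(0)
eeg = rng.normal(size=(8, 32))
aligned = eeg + 0.01 * rng.normal(size=(8, 32))   # visual features close to their EEG pair
shuffled = np.roll(aligned, 1, axis=0)            # same features, wrong pairing

loss_aligned = symmetric_info_nce(aligned, eeg)
loss_shuffled = symmetric_info_nce(shuffled, eeg)
print(f"aligned pairs: {loss_aligned:.4f}, shuffled pairs: {loss_shuffled:.4f}")
```

In this toy setup the loss is lower when each visual embedding sits next to its own EEG embedding than when the pairing is shuffled, which is the signal such an objective would use to pull the visual encoder toward the EEG model's representation.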