Active Learning from Scene Embeddings for End-to-End Autonomous Driving

📅 2025-03-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
End-to-end autonomous driving models require large-scale annotated data for training, yet real-world driving data exhibit a long-tailed distribution, making high-value, safety-critical scenarios difficult to identify and prioritize. To address this, the authors propose SEAD, a Scenario Embedding-based Active Learning framework. SEAD is the first method to construct scenario embeddings directly from bird's-eye-view (BEV) features, jointly leveraging driving-environment priors and representation-driven uncertainty estimation, which eliminates the need for handcrafted selection criteria. By combining environment-aware initial sampling with incremental uncertainty quantification, SEAD reaches 98.2% of the full-dataset performance (nuScenes Detection Score, NDS) while using only 30% of the nuScenes training data, substantially reducing annotation cost and improving generalization under distribution shift.
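The incremental, uncertainty-driven selection step can be illustrated with a greedy coreset-style heuristic over scene embeddings: repeatedly label the scene whose embedding lies farthest from everything already selected, so rare scenarios are prioritized. This is a minimal sketch of embedding-based active selection in general, not SEAD's exact criterion, which the summary does not specify; the function name and inputs are hypothetical.

```python
import numpy as np

def select_incremental(embeddings, selected_idx, k):
    """Greedy farthest-point (coreset-style) active selection sketch.

    embeddings:   (N, D) array of scene-level (e.g. BEV-derived) embeddings
    selected_idx: indices of scenes already labeled
    k:            number of additional scenes to pick

    Illustrative stand-in for an embedding-based uncertainty criterion;
    not the paper's actual algorithm.
    """
    selected = list(selected_idx)
    # distance of every scene to its nearest already-selected embedding
    dists = np.min(
        np.linalg.norm(
            embeddings[:, None, :] - embeddings[selected][None, :, :], axis=-1
        ),
        axis=1,
    )
    for _ in range(k):
        pick = int(np.argmax(dists))  # most "uncovered" scene
        selected.append(pick)
        # the new pick now covers its neighborhood: update nearest distances
        dists = np.minimum(
            dists, np.linalg.norm(embeddings - embeddings[pick], axis=-1)
        )
    return selected
```

Because each pick maximizes distance to the labeled set, clusters of near-duplicate easy scenes contribute at most one sample early on, which matches the motivation of down-weighting the long tail's head.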

📝 Abstract
In the field of autonomous driving, end-to-end deep learning models show great potential by learning driving decisions directly from sensor data. However, training these models requires large amounts of labeled data, which is time-consuming and expensive. Considering that the real-world driving data exhibits a long-tailed distribution where simple scenarios constitute a majority part of the data, we are thus inspired to identify the most challenging scenarios within it. Subsequently, we can efficiently improve the performance of the model by training with the selected data of the highest value. Prior research has focused on the selection of valuable data by empirically designed strategies. However, manually designed methods suffer from being less generalizable to new data distributions. Observing that the BEV (Bird's Eye View) features in end-to-end models contain all the information required to represent the scenario, we propose an active learning framework that relies on these vectorized scene-level features, called SEAD. The framework selects initial data based on driving-environmental information and incremental data based on BEV features. Experiments show that we only need 30% of the nuScenes training data to achieve performance close to what can be achieved with the full dataset. The source code will be released.
Problem

Research questions and friction points this paper is trying to address.

Reduces labeled data requirement for autonomous driving models
Identifies challenging scenarios in long-tailed driving data
Proposes SEAD framework using BEV features for active learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Active learning using BEV features
Selects data based on environmental information
Achieves high performance with 30% data
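The environment-aware initial sampling idea can be sketched as stratified sampling over scene-level environment metadata, so the seed set covers rare conditions before any model-based uncertainty is available. The tag fields (`weather`, `time_of_day`) are hypothetical placeholders, not SEAD's actual schema.

```python
import random
from collections import defaultdict

def initial_sample(scenes, budget, seed=0):
    """Environment-balanced initial sampling sketch.

    scenes: list of dicts with hypothetical environment tags
            ("weather", "time_of_day")
    budget: number of scenes to select for the initial labeled pool

    Stratifies scenes by environment tag and draws evenly from each
    stratum, so rare environments are represented in the seed set.
    Illustrative only; not the paper's exact procedure.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for i, scene in enumerate(scenes):
        strata[(scene["weather"], scene["time_of_day"])].append(i)
    per_stratum = max(1, budget // len(strata))
    picked = []
    for indices in strata.values():
        rng.shuffle(indices)
        picked.extend(indices[:per_stratum])  # even draw per environment
    return picked[:budget]
```

An even per-stratum draw is the simplest choice; a weighted draw (e.g. inverse-frequency) would bias the seed set further toward tail environments.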