🤖 AI Summary
Existing evaluation methods for autonomous driving planners overlook the long tail of perception errors, leaving planner robustness insufficiently tested. Method: We propose EMPERROR, a transformer-based generative perception error model (PEM) that jointly captures the diverse failure modes of modern multimodal detectors (e.g., missed detections, false positives, localization shifts), enabling controllable synthesis of realistic adversarial perception noise. Contribution/Results: Used to stress-test an imitation learning (IL) planner in simulation, EMPERROR imitates the target detector more faithfully than prior deterministic or stochastic error models and increases the planner's collision rate by up to 85%, enabling a more realistic and rigorous assessment of planner reliability under degraded perception.
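To make the three failure modes concrete, here is a minimal toy sketch of a *stochastic* PEM of the kind the paper improves upon: it independently drops boxes (missed detections), jitters box centers (localization shifts), and injects spurious boxes (false positives). All function and parameter names (`apply_perception_noise`, `p_miss`, `p_fp`, `sigma`) are illustrative assumptions, not EMPERROR's actual transformer-based, jointly learned model.

```python
import random

def apply_perception_noise(gt_boxes, p_miss=0.1, p_fp=0.05, sigma=0.3, rng=None):
    """Toy stochastic perception error model (illustrative only).

    gt_boxes: list of (x, y) ground-truth object centers.
    Each error mode is sampled independently per box, unlike a learned
    generative PEM, which models their joint distribution.
    """
    rng = rng or random.Random(0)
    noisy = []
    for (x, y) in gt_boxes:
        if rng.random() < p_miss:
            # missed detection: the box is simply dropped
            continue
        # localization shift: Gaussian jitter on the box center
        noisy.append((x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)))
    if rng.random() < p_fp:
        # false positive: spurious box at a random location
        noisy.append((rng.uniform(-50.0, 50.0), rng.uniform(-50.0, 50.0)))
    return noisy
```

Feeding such perturbed detections to a planner instead of ground truth is the basic stress-testing setup; EMPERROR replaces the independent hand-tuned noise above with samples from a transformer trained to match a real detector's error statistics.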
📝 Abstract
To handle the complexities of real-world traffic, learning planners for self-driving from data is a promising direction. While recent approaches have shown great progress, they typically assume a setting in which the ground-truth world state is available as input. However, when deployed, planning needs to be robust to the long-tail of errors incurred by a noisy perception system, which is often neglected in evaluation. To address this, previous work has proposed drawing adversarial samples from a perception error model (PEM) mimicking the noise characteristics of a target object detector. However, these methods use simple PEMs that fail to accurately capture all failure modes of detection. In this letter, we present Emperror, a novel transformer-based generative PEM, apply it to stress-test an imitation learning (IL)-based planner and show that it imitates modern detectors more faithfully than previous work. Furthermore, it is able to produce realistic noisy inputs that increase the planner's collision rate by up to 85%, demonstrating its utility as a valuable tool for a more complete evaluation of self-driving planners.