Attacking Autonomous Driving Agents with Adversarial Machine Learning: A Holistic Evaluation with the CARLA Leaderboard

📅 2025-11-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work systematically evaluates the practical risks of adversarial machine learning against end-to-end autonomous driving agents. Using the CARLA simulator, the authors generate physically realizable adversarial patches across diverse driving scenarios and conduct red-teaming tests targeting critical vehicle control behaviors, including steering and braking. To their knowledge, this is the first study to perform end-to-end adversarial evaluation on complete, open-source autonomous driving stacks, spanning perception, planning, and control, without modifying any agent source code. The experiments reveal that while adversarial patches can significantly mislead perception models into issuing erroneous high-level commands, downstream components such as PID controllers or GPS-based rule modules exhibit inherent robustness and can partially correct anomalous actions. These findings highlight the mitigating role of control-layer components in defending against model-level attacks, and support an empirically grounded approach to safety assessment for autonomous driving systems.

📝 Abstract
To autonomously control vehicles, driving agents use outputs from a combination of machine-learning (ML) models, controller logic, and custom modules. Although numerous prior works have shown that adversarial examples can mislead ML models used in autonomous driving contexts, it remains unclear if these attacks are effective at producing harmful driving actions for various agents, environments, and scenarios. To assess the risk of adversarial examples to autonomous driving, we evaluate attacks against a variety of driving agents, rather than against ML models in isolation. To support this evaluation, we leverage CARLA, an urban driving simulator, to create and evaluate adversarial examples. We create adversarial patches designed to stop or steer driving agents, stream them into the CARLA simulator at runtime, and evaluate them against agents from the CARLA Leaderboard, a public repository of best-performing autonomous driving agents from an annual research competition. Unlike prior work, we evaluate attacks against autonomous driving systems without creating or modifying any driving-agent code and against all parts of the agent included with the ML model. We perform a case-study investigation of two attack strategies against three open-source driving agents from the CARLA Leaderboard across multiple driving scenarios, lighting conditions, and locations. Interestingly, we show that, although some attacks can successfully mislead ML models into predicting erroneous stopping or steering commands, some driving agents use modules, such as PID control or GPS-based rules, that can overrule attacker-manipulated predictions from ML models.
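The abstract's patch-streaming setup can be illustrated in outline: a fixed patch image is composited into the camera frame before the frame reaches the agent's perception model, so the agent only ever sees the patched view. A minimal numpy sketch (the function name, frame size, and patch placement are illustrative, not taken from the paper):

```python
import numpy as np

def apply_patch(frame: np.ndarray, patch: np.ndarray, top: int, left: int) -> np.ndarray:
    """Composite an adversarial patch into a camera frame (H, W, 3, uint8).

    Mimics streaming a patch into the simulator's camera feed at runtime:
    the original frame is left untouched and the perception model receives
    only the patched copy.
    """
    h, w = patch.shape[:2]
    out = frame.copy()
    out[top:top + h, left:left + w] = patch
    return out

# Example: a 600x800 RGB frame with a 50x50 white patch composited in.
frame = np.zeros((600, 800, 3), dtype=np.uint8)
patch = np.full((50, 50, 3), 255, dtype=np.uint8)
patched = apply_patch(frame, patch, top=100, left=200)
```

In a full attack, the patch pixels themselves would be optimized (e.g. by gradient methods against the perception model) and rendered with realistic perspective and lighting; this sketch only shows the runtime compositing step.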
Problem

Research questions and friction points this paper is trying to address.

Evaluating adversarial attack effectiveness on autonomous driving agents holistically
Assessing if adversarial examples cause harmful driving actions across scenarios
Testing attacks against complete driving systems without modifying agent code
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluating adversarial attacks on autonomous driving agents holistically
Using CARLA simulator for runtime adversarial patch streaming
Assessing attacks without modifying driving agents' code
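The paper's observation that PID control can overrule attacker-manipulated predictions can be illustrated with a toy example: because a PID loop tracks the measured vehicle state and its commands are bounded, a single fooled "stop" prediction perturbs the vehicle only briefly. A minimal sketch, with all gains, time step, and vehicle dynamics chosen for illustration (not from the paper):

```python
class PID:
    """Toy PID speed controller. Tracks a target speed against the
    measured speed; a one-tick adversarial target change produces only
    a transient command, not a sustained stop."""

    def __init__(self, kp, ki, kd, dt=0.05):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, target, measured):
        error = target - measured
        self.integral += error * self.dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

pid = PID(kp=0.5, ki=0.1, kd=0.05)
speed = 10.0                 # m/s, measured vehicle speed
targets = [10.0] * 20
targets[5] = 0.0             # one adversarially induced "stop" prediction
commands, speeds = [], []
for target in targets:
    cmd = max(min(pid.step(target, speed), 1.0), -1.0)  # bounded throttle/brake
    commands.append(cmd)
    speed += cmd * 0.05      # toy longitudinal dynamics
    speeds.append(speed)
```

Running this, the fooled tick saturates the brake command for one step, yet the speed barely dips before the controller recovers, mirroring the paper's finding that control-layer modules can dampen model-level attacks.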