🤖 AI Summary
Deep reinforcement learning (DRL)-based autonomous driving policies are vulnerable to adversarial attacks; existing methods suffer from inefficiency under high-frequency perturbations and training instability under sparse attacks. Method: This paper proposes an adaptive expert-guided adversarial attack framework. It integrates imitation learning to synthesize a robust expert policy and employs a Mixture-of-Experts architecture to enhance generalization. A performance-aware annealing mechanism dynamically adjusts KL-divergence regularization strength to balance attack sparsity and training stability—without requiring ideal expert priors. Contribution/Results: The framework significantly improves attack success rate and collision rate over state-of-the-art baselines. Experiments demonstrate superior attack efficiency, training stability, and robustness across diverse driving scenarios, establishing new performance benchmarks for adversarial policy attacks in autonomous driving.
📝 Abstract
Deep reinforcement learning (DRL) has emerged as a promising paradigm for autonomous driving. However, despite their advanced capabilities, DRL-based policies remain highly vulnerable to adversarial attacks, posing serious safety risks in real-world deployments. Investigating such attacks is crucial for revealing policy vulnerabilities and guiding the development of more robust autonomous systems. While prior attack methods have made notable progress, they still face several challenges: 1) they often rely on high-frequency attacks, yet critical attack opportunities are typically context-dependent and temporally sparse, resulting in inefficient attack patterns; 2) restricting attack frequency can improve efficiency but often results in unstable training due to the adversary's limited exploration. To address these challenges, we propose an adaptive expert-guided adversarial attack method that enhances both the stability and efficiency of attack policy training. Our method first derives an expert policy from successful attack demonstrations using imitation learning, strengthened by an ensemble Mixture-of-Experts architecture for robust generalization across scenarios. This expert policy then guides a DRL-based adversary through a KL-divergence regularization term. Because scenarios are diverse, the expert policy may be imperfect; we therefore introduce a performance-aware annealing strategy that gradually reduces reliance on the expert as the adversary improves. Extensive experiments demonstrate that our method outperforms existing approaches in terms of collision rate, attack efficiency, and training stability, especially in cases where the expert policy is sub-optimal.
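The expert-guided objective described above can be sketched minimally: the adversary's policy loss is augmented with a KL-divergence term toward the expert's action distribution, and the KL weight is annealed down as the adversary's performance approaches the expert's. This is an illustrative reconstruction, not the paper's actual implementation; the function names, the linear annealing schedule, and the use of return ratios as the performance signal are all assumptions.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete action distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def annealed_kl_coeff(beta0, adversary_return, expert_return, eps=1e-8):
    """Performance-aware annealing (assumed schedule): shrink the KL
    weight linearly as the adversary's attack return approaches, or
    exceeds, the expert's, removing reliance on an imperfect expert."""
    ratio = adversary_return / (expert_return + eps)
    return beta0 * max(0.0, 1.0 - ratio)

def regularized_loss(policy_loss, adversary_dist, expert_dist, beta):
    """Total adversary objective: the DRL policy loss plus the
    expert-guidance regularizer, weighted by the annealed beta."""
    return policy_loss + beta * kl_divergence(adversary_dist, expert_dist)

# Early in training the adversary is weak, so guidance is strong;
# once it matches the expert, the KL term vanishes.
beta_early = annealed_kl_coeff(beta0=1.0, adversary_return=0.0, expert_return=10.0)
beta_late = annealed_kl_coeff(beta0=1.0, adversary_return=10.0, expert_return=10.0)
```

In this sketch, `beta_early` stays near `beta0` while `beta_late` is close to zero, so the adversary explores freely once it no longer benefits from imitating a possibly sub-optimal expert.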