🤖 AI Summary
Object detection models in autonomous driving are vulnerable to physical-world adversarial attacks, posing critical safety risks.
Method: This paper proposes the first 2D/3D joint adversarial training framework for generating physically deployable robust adversarial patches. It introduces a novel non-rigid surface modeling technique coupled with a geometry-material co-optimized 3D photorealistic matching mechanism to mitigate intra-class variation and environmental variability—key bottlenecks in generalization. The method integrates physical realizability constraints, multi-model collaborative training (YOLOv5/v8, Faster R-CNN, DETR), and multi-view, multi-illumination, and multi-distance robust optimization.
Contribution/Results: Evaluated on eight mainstream detectors, our approach achieves state-of-the-art performance in both digital and physical experiments: a 72.1% average physical-world attack success rate—19.6% higher than prior methods—while significantly improving cross-model transferability and cross-condition robustness.
📝 Abstract
Autonomous vehicles are typical complex intelligent systems with artificial intelligence at their core. However, their deep-learning-based perception methods are extremely vulnerable to adversarial examples, which can lead to safety-critical accidents. Generating effective adversarial examples in the physical world, and using them to evaluate object detection systems, remains a major challenge. In this study, we propose a unified joint adversarial training framework for both 2D and 3D samples to address the challenges of intra-class diversity and environmental variation in real-world scenarios. Building upon this framework, we introduce an adversarial-sample realism enhancement approach that incorporates non-rigid surface modeling and a realistic 3D matching mechanism. We compare our method with five state-of-the-art adversarial patches and evaluate attack performance on eight object detectors, including single-stage, two-stage, and transformer-based models. Extensive experimental results in digital and physical environments demonstrate that the adversarial textures generated by our method effectively mislead the target detection models. Moreover, the proposed method exhibits excellent robustness and transferability under multi-angle attacks, varying lighting conditions, and different distances in the physical world. The demo video and code are available at https://github.com/Huangyh98/AdvReal.git.
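The core idea of multi-model, transformation-robust patch optimization can be illustrated with a toy sketch. This is not the authors' implementation: the "detectors" here are hypothetical linear scorers standing in for the real models (YOLOv5/v8, Faster R-CNN, DETR), and random brightness scaling is a crude proxy for the multi-illumination robustness the paper optimizes over, in the spirit of Expectation over Transformation (EOT):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # flattened toy patch size

# Hypothetical linear "detectors": objectness score = w . patch.
# Stand-ins for real detection models, used only to show the loop shape.
weights = [np.random.default_rng(s).normal(size=DIM) for s in range(3)]

def score(w, x):
    """Toy objectness score of patch x under detector weights w."""
    return float(w @ x)

def optimize_patch(steps=300, lr=0.05, samples=8):
    """Minimize mean objectness across all toy detectors under random
    brightness transforms (an EOT-style robustness surrogate)."""
    patch = np.full(DIM, 0.5)  # start from mid-gray
    for _ in range(steps):
        grad = np.zeros(DIM)
        for _ in range(samples):
            a = rng.uniform(0.7, 1.3)  # random illumination factor
            for w in weights:
                grad += a * w          # d/dpatch of w . (a * patch)
        # gradient descent on the averaged score, keep pixels in [0, 1]
        patch = np.clip(patch - lr * grad / (samples * len(weights)), 0, 1)
    return patch

patch = optimize_patch()
base = np.full(DIM, 0.5)
before = np.mean([score(w, base) for w in weights])
after = np.mean([score(w, patch) for w in weights])
# the optimized patch lowers the ensemble's mean objectness score
```

A real pipeline would replace the linear scorers with detector losses, render the patch onto a 3D surface (the paper's non-rigid modeling step), and backpropagate through the rendering; this sketch only shows the ensemble-plus-transformation structure of the optimization.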