Natural Reflection Backdoor Attack on Vision Language Model for Autonomous Driving

📅 2025-05-09

📈 Citations: 0

✨ Influential: 0

career value

233K/year

🤖 AI Summary

This work identifies a novel physics-inspired backdoor attack targeting vision-language models (VLMs) in autonomous driving: leveraging natural reflective textures—such as those from glass or water surfaces—as imperceptible visual triggers, combined with redundant textual prefixes, to induce abnormally lengthy model responses and thereby cause critical decision latency. Unlike prior digital-only attacks, this approach is the first to model real-world natural reflections as low-perceptibility physical perturbations. It integrates multi-view data poisoning with parameter-efficient fine-tuning (LoRA/Adapter) and demonstrates strong cross-view transferability on the DriveLM benchmark. Experiments show attack success rates exceeding 92% on Qwen2-VL and LLaMA-Adapter, with negligible accuracy degradation (<1%), while average response latency increases by 3.8× (up to 12.4 seconds), exposing a severe real-time safety vulnerability in VLM-augmented autonomous driving systems.

Technology Category

Application Category

📝 Abstract

Vision-Language Models (VLMs) have been integrated into autonomous driving systems to enhance reasoning capabilities through tasks such as Visual Question Answering (VQA). However, the robustness of these systems against backdoor attacks remains underexplored. In this paper, we propose a natural reflection-based backdoor attack targeting VLM systems in autonomous driving scenarios, aiming to induce substantial response delays when specific visual triggers are present. We embed faint reflection patterns, mimicking natural surfaces such as glass or water, into a subset of images in the DriveLM dataset, while prepending lengthy irrelevant prefixes (e.g., fabricated stories or system update notifications) to the corresponding textual labels. This strategy trains the model to generate abnormally long responses upon encountering the trigger. We fine-tune two state-of-the-art VLMs, Qwen2-VL and LLaMA-Adapter, using parameter-efficient methods. Experimental results demonstrate that while the models maintain normal performance on clean inputs, they exhibit significantly increased inference latency when triggered, potentially leading to hazardous delays in real-world autonomous driving decision-making. Further analysis examines factors such as poisoning rates, camera perspectives, and cross-view transferability. Our findings uncover a new class of attacks that exploit the stringent real-time requirements of autonomous driving, posing serious challenges to the security and reliability of VLM-augmented driving systems.

Problem

Research questions and friction points this paper is trying to address.

Explores VLM vulnerability to natural reflection backdoor attacks in autonomous driving

Induces hazardous response delays via visual triggers and textual prefixes

Tests attack impact on real-time decision-making in VLM-augmented driving systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Natural reflection patterns as visual triggers

Lengthy irrelevant prefixes in textual labels

Parameter-efficient fine-tuning of VLMs

🔎 Similar Papers

A Survey and Evaluation of Adversarial Attacks for Object Detection