Test-Time Backdoor Detection for Object Detection Models

📅 2025-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Detecting black-box backdoor attacks against object detection models at inference time remains challenging, largely because multi-object outputs and triggers induce complex artifacts (e.g., phantom or disappearing objects) that render conventional detection methods ineffective. Method: We propose TRACE, the first model-agnostic, gradient-free, and architecture-agnostic test-time backdoor detection framework. TRACE builds on a novel empirical observation: poisoned samples exhibit higher detection consistency under background transformations, whereas clean samples show greater consistency under focal-length variations. It quantifies confidence variance across foreground, background, and focal-length transformations to assess transformational consistency. Results: Evaluated on COCO and PASCAL VOC, TRACE achieves a 30% AUROC improvement over state-of-the-art methods and demonstrates robustness against adaptive attacks.

📝 Abstract
Object detection models are vulnerable to backdoor attacks, where attackers poison a small subset of training samples by embedding a predefined trigger to manipulate predictions. Detecting poisoned samples (i.e., those containing triggers) at test time can prevent backdoor activation. However, unlike image classification tasks, the unique characteristics of object detection -- particularly its output of numerous objects -- pose fresh challenges for backdoor detection. The complex attack effects (e.g., "ghost" object emergence or "vanishing" objects) further render current defenses fundamentally inadequate. To this end, we design TRAnsformation Consistency Evaluation (TRACE), a brand-new method for detecting poisoned samples at test time in object detection. Our journey begins with two intriguing observations: (1) poisoned samples exhibit significantly more consistent detection results than clean ones across varied backgrounds; (2) clean samples show higher detection consistency when introduced to different focal information. Based on these phenomena, TRACE applies foreground and background transformations to each test sample, then assesses transformation consistency by calculating the variance in object confidences. TRACE achieves black-box, universal backdoor detection, with extensive experiments showing a 30% improvement in AUROC over state-of-the-art defenses and resistance to adaptive attacks.
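The abstract's scoring idea can be sketched numerically. The snippet below is a toy illustration, not the paper's implementation: it assumes we already have, for one detected object, confidence scores from several background-transformed copies of a test image and from several focal-length-style transforms (e.g., zooms/crops), and it compares the two variances. Per the paper's observation, poisoned samples stay consistent (low variance) under background changes while clean samples stay consistent under focal changes, so a higher score here hints at a poisoned sample. The function name, threshold logic, and all numbers are hypothetical.

```python
import statistics

def trace_like_score(conf_background, conf_focal):
    """Toy transformation-consistency score in the spirit of TRACE.

    conf_background: confidences for the same object across
        background-transformed copies of the image (hypothetical values).
    conf_focal: confidences across focal-length-style transforms.

    Returns var(focal) - var(background); larger values suggest the
    detection is stable under background edits but unstable under focal
    edits, the pattern the paper associates with poisoned samples.
    """
    var_bg = statistics.pvariance(conf_background)
    var_focal = statistics.pvariance(conf_focal)
    return var_focal - var_bg

# Hypothetical confidences: a poisoned-looking sample is stable under
# background swaps but erratic under zooms; a clean-looking sample is
# the opposite.
poisoned_like = trace_like_score([0.91, 0.90, 0.92], [0.95, 0.40, 0.70])
clean_like = trace_like_score([0.80, 0.30, 0.60], [0.85, 0.84, 0.86])
```

In this sketch `poisoned_like` comes out positive and `clean_like` negative, so a simple threshold at zero would separate the two; the actual method aggregates such statistics over foreground, background, and focal transformations for all detected objects.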
Problem

Research questions and friction points this paper is trying to address.

Detecting backdoor attacks in object detection models
Challenges due to unique object detection characteristics
Improving backdoor detection accuracy with TRACE method
Innovation

Methods, ideas, or system contributions that make the work stand out.

TRACE method detects poisoned samples effectively
Uses transformation consistency for backdoor detection
Improves AUROC by 30% over existing defenses
Hangtao Zhang
Huazhong University of Science and Technology (HUST)
AI Security
Yichen Wang
School of Cyber Science and Engineering, Huazhong University of Science and Technology
Shihui Yan
School of Cyber Science and Engineering, Huazhong University of Science and Technology
Chenyu Zhu
School of Software Engineering, Huazhong University of Science and Technology
Ziqi Zhou
School of Computer Science and Technology, Huazhong University of Science and Technology
Linshan Hou
Harbin Institute of Technology
Shengshan Hu
School of CSE, Huazhong University of Science and Technology (HUST)
AI Security, Embodied AI, Autonomous Driving
Minghui Li
Huazhong University of Science and Technology
AI Security
Yanjun Zhang
Lecturer, University of Technology Sydney
Security and Privacy, Machine Learning
Leo Yu Zhang
Griffith University