Test-Time Backdoor Detection for Object Detection Models

📅 2025-03-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Black-box detection of backdoor attacks against object detection models at inference time remains challenging. Multi-object outputs and triggers induce complex artifacts (e.g., phantom or disappearing objects) that render conventional detection methods ineffective. Method: The paper proposes TRACE (TRAnsformation Consistency Evaluation), presented as the first model-agnostic, gradient-free test-time backdoor detection framework for object detection. It builds on two empirical observations: poisoned samples exhibit higher detection consistency than clean ones across varied backgrounds, whereas clean samples show higher consistency when different focal information is introduced. TRACE applies foreground and background transformations to each test sample and measures the variance in object confidences to assess transformation consistency. Results: Evaluated on COCO and PASCAL VOC, TRACE achieves a 30% AUROC improvement over state-of-the-art defenses and resists adaptive attacks.

📝 Abstract

Object detection models are vulnerable to backdoor attacks, where attackers poison a small subset of training samples by embedding a predefined trigger to manipulate predictions. Detecting poisoned samples (i.e., those containing triggers) at test time can prevent backdoor activation. However, unlike image classification, the unique characteristics of object detection, particularly its output of numerous objects, pose fresh challenges for backdoor detection. The complex attack effects (e.g., "ghost" object emergence or "vanishing" objects) further render current defenses fundamentally inadequate. To this end, we design TRAnsformation Consistency Evaluation (TRACE), a new method for detecting poisoned samples at test time in object detection. We start from two observations: (1) poisoned samples exhibit significantly more consistent detection results than clean ones across varied backgrounds; (2) clean samples show higher detection consistency when different focal information is introduced. Based on these phenomena, TRACE applies foreground and background transformations to each test sample, then assesses transformation consistency by calculating the variance in object confidences. TRACE achieves black-box, universal backdoor detection, with extensive experiments showing a 30% improvement in AUROC over state-of-the-art defenses and resistance to adaptive attacks.
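The variance-of-confidences idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `detect` callable, the transformation functions, and the choice to summarise each transformed view by its mean confidence are all assumptions made here for illustration, and the abstract does not say how objects are matched across views or how the two variances are combined into a final decision.

```python
from statistics import fmean, pvariance
from typing import Callable, Sequence, TypeVar

Image = TypeVar("Image")


def consistency_variance(
    detect: Callable[[Image], Sequence[float]],
    image: Image,
    transforms: Sequence[Callable[[Image], Image]],
) -> float:
    """Variance of a per-view confidence summary across transformed views.

    `detect` is a hypothetical black-box detector returning one confidence
    per detected object; only its outputs are used (no gradients).
    """
    summaries = []
    for transform in transforms:
        confidences = detect(transform(image))
        # Assumption: summarise each view by its mean confidence,
        # and use 0.0 when nothing is detected.
        summaries.append(fmean(confidences) if confidences else 0.0)
    return pvariance(summaries)


# Toy stand-ins (hypothetical): an "image" is a flat list of pixel values and
# the "detector" reports a single object whose confidence is the mean pixel.
def toy_detector(pixels):
    return [fmean(pixels)]


image = [0.5, 0.5, 0.5, 0.5]
stable_views = [lambda p: p, lambda p: list(p)]
unstable_views = [lambda p: p, lambda p: [0.2 * v for v in p]]

low = consistency_variance(toy_detector, image, stable_views)
high = consistency_variance(toy_detector, image, unstable_views)
```

In this toy setup, `low` is zero because both views yield the same confidence, while `high` is positive because the second view changes the detector's output. A lower variance means higher consistency under that family of transformations.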

Problem

Research questions and friction points this paper is trying to address.

Detecting backdoor attacks in object detection models

Handling challenges unique to object detection, such as multi-object outputs and "ghost" or "vanishing" object effects

Improving backdoor detection accuracy with the TRACE method

Innovation

Methods, ideas, or system contributions that make the work stand out.

TRACE detects poisoned samples at test time in a black-box setting

Uses transformation consistency for backdoor detection

Improves AUROC by 30% over state-of-the-art defenses
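For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen poisoned sample receives a higher detection score than a randomly chosen clean one. The sketch below computes it by direct pairwise comparison. The scores and labels are made-up illustrative values, not results from the paper, and the convention that a higher score means "more likely poisoned" is an assumption.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUROC: label 1 = poisoned, label 0 = clean (assumed convention)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # poisoned sample ranked above clean sample
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Illustrative values only (hypothetical detector scores).
example = auroc([0.9, 0.7, 0.4, 0.2], [1, 0, 1, 0])
```

Here three of the four poisoned/clean pairs are ranked correctly, so `example` is 0.75. A detector that ranks at random would score about 0.5, and a perfect one 1.0.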

🔎 Similar Papers

A Survey and Evaluation of Adversarial Attacks for Object Detection

2024-08-04, arXiv.org, Citations: 1

CEPA: Consensus Embedded Perturbation for Agnostic Detection and Inversion of Backdoors

2024-02-03, Citations: 0

Rethinking Backdoor Detection Evaluation for Language Models

2024-08-31, arXiv.org, Citations: 3
