Visual Prompting Meets Feature Reconstruction-Based Anomaly Detection with Dual-Teacher Supervision

📅 2026-06-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the significant performance degradation of existing anomaly detection methods under variations in object scale, viewpoint, background, and illumination, which hinders their applicability in real-world complex scenarios. To overcome this limitation, the authors propose a novel framework that integrates visual prompting with feature reconstruction. The approach isolates the target region using a foreground-background mask, employs a tunable dual-teacher supervision mechanism to enhance domain adaptability, and leverages a diffusion model to generate synthetic data for effective augmentation. Evaluated on the AeBAD dataset, the method achieves state-of-the-art performance, surpassing prior approaches by 3.5 percentage points, and demonstrates superior robustness and generalization capability in both anomaly detection and segmentation tasks.
📝 Abstract
Recent Anomaly Detection methods achieve perfect detection and segmentation scores on well-established datasets, such as MVTec. However, many of these methods face challenges when foundational assumptions - such as consistent object scale, viewpoint, background, illumination, and centered placement - are violated. Such variations render these methods unusable in many real-world scenarios. To address these limitations, we introduce three key contributions: (1) a visual prompting pipeline that isolates objects using foreground-background masking; (2) a mechanism for unfreezing the teacher in student-teacher models to improve domain adaptability; and (3) a data augmentation strategy leveraging diffusion-generated synthetic images to enhance anomaly detection performance. We achieve a 3.5 percentage point improvement over the previous state-of-the-art on the challenging AeBAD dataset by using the Masked Multiscale Reconstruction (MMR) model as our backbone.
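To make the first contribution concrete, the following is a minimal sketch of how a foreground-background mask could restrict anomaly scoring to the object. It is not the authors' implementation. It assumes a common scoring scheme for student-teacher and feature-reconstruction models, in which the per-location anomaly score is the cosine distance between teacher and student feature vectors; the abstract does not confirm this choice. The function names (`cosine_distance`, `masked_anomaly_map`, `image_score`) and the toy data are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: a zero vector counts as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def masked_anomaly_map(teacher, student, mask):
    """Score each location by teacher/student disagreement.

    teacher, student: H x W grids of feature vectors (nested lists).
    mask: H x W grid of 0/1 values, where 1 marks foreground.
    Background locations (mask == 0) receive a score of 0.0.
    """
    return [
        [cosine_distance(t, s) if m else 0.0
         for t, s, m in zip(t_row, s_row, m_row)]
        for t_row, s_row, m_row in zip(teacher, student, mask)
    ]


def image_score(anomaly_map):
    """Image-level score: the maximum pixel-level score (an assumed choice)."""
    return max(max(row) for row in anomaly_map)


# Toy 2 x 2 feature grids with 2-dimensional features (hypothetical data).
teacher = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 1.0]]]
student = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 1.0]]]
mask = [[1, 1], [0, 1]]  # bottom-left location is background

amap = masked_anomaly_map(teacher, student, mask)
```

In the toy data, the bottom-left location has a large teacher-student disagreement, but the mask marks it as background, so it does not contribute to the image-level score. This is the intended effect of isolating the object: background clutter cannot raise false alarms.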
Problem

Research questions and friction points this paper is trying to address.

Anomaly Detection
Domain Adaptation
Real-world Scenarios
Object Variability
Visual Prompting
Innovation

Methods, ideas, or system contributions that make the work stand out.

visual prompting
dual-teacher supervision
feature reconstruction
diffusion-based augmentation
anomaly detection
Mateo Diaz-Bone
IBM Research Europe, Zurich, Switzerland
Daniel Caraballo
IBM Research Europe, Zurich, Switzerland
Florian Scheidegger
IBM, ETH
machine learning, deep learning, software engineering, low precision
Thomas Frick
Researcher, IBM Research Zurich
Deep Learning, Machine Learning
Mattia Rigotti
Researcher at IBM Research AI
neuroscience, neural networks, deep learning, machine learning, computer vision
Andrea Bartezzaghi
Research Staff Member @ IBM Research - Zürich
Roy Assaf
IBM Research Zurich
Deep Learning, Machine Learning, Applied mathematics
Niccolo Avogaro
IBM Research Europe, Zurich, Switzerland
Yagmur G. Cinar
IBM Research Europe, Zurich, Switzerland
Brown Ebouky
IBM Research Europe, Zurich, Switzerland
Filip M. Janicki
IBM Research Europe, Zurich, Switzerland
Piotr S. Kluska
IBM Research Europe, Zurich, Switzerland
Cezary Skura
IBM Research Europe, Zurich, Switzerland
Cristiano Malossi
IBM Research Europe, Zurich, Switzerland