Towards Efficient Pixel Labeling for Industrial Anomaly Detection and Localization

📅 2025-09-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Industrial anomaly detection (AD) faces a labeling bottleneck: most existing methods train only on defect-free samples, because pixel-level annotation of defective samples is too labor-intensive to scale. This paper introduces ADClick, an interactive image segmentation algorithm that generates pixel-level anomaly masks from only a few user clicks (e.g., 1–3 points) and a brief textual description. Labels produced this way significantly improve AD model performance (e.g., AP = 96.1% on MVTec AD). The paper also proposes ADClick-Seg, a cross-modal framework that aligns visual features with textual prompts through a prototype-based approach, combining click-guided pixel-level priors with language-guided cues. On the multi-class MVTec AD task, ADClick-Seg reports state-of-the-art results: 80.0% AP, 97.5% PRO, and 99.1% Pixel-AUROC.
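The summary does not say how clicks are fed to the model. One common choice in interactive segmentation is to turn each click into a smooth spatial map that is stacked with the image as extra input channels. The sketch below illustrates that general idea only; the function name `click_maps`, the Gaussian shape, and the `sigma` parameter are assumptions for illustration, not ADClick's documented encoding.

```python
import math

def click_maps(height, width, clicks, sigma=5.0):
    """Encode user clicks as two prior maps (hypothetical encoding).

    clicks: list of (row, col, is_positive) tuples, where a positive click
    marks a defect pixel and a negative click marks a normal pixel.
    Returns (positive_map, negative_map), each a height x width grid of
    floats in [0, 1] that peaks at 1.0 on the clicked pixel.
    """
    pos = [[0.0] * width for _ in range(height)]
    neg = [[0.0] * width for _ in range(height)]
    for r0, c0, is_positive in clicks:
        target = pos if is_positive else neg
        for r in range(height):
            for c in range(width):
                d2 = (r - r0) ** 2 + (c - c0) ** 2
                g = math.exp(-d2 / (2.0 * sigma ** 2))
                # Keep the strongest response when several clicks overlap.
                target[r][c] = max(target[r][c], g)
    return pos, neg

# Usage: one click on the defect, one on the background.
pos_map, neg_map = click_maps(8, 8, [(2, 3, True), (6, 6, False)], sigma=1.0)
```

A Gaussian is used here because it gives a soft falloff around each click; a fixed-radius disk is an equally plausible alternative.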

📝 Abstract

Industrial product inspection is often performed using Anomaly Detection (AD) frameworks trained solely on non-defective samples. Although defective samples can be collected during production, leveraging them usually requires pixel-level annotations, limiting scalability. To address this, we propose ADClick, an Interactive Image Segmentation (IIS) algorithm for industrial anomaly detection. ADClick generates pixel-wise anomaly annotations from only a few user clicks and a brief textual description, enabling precise and efficient labeling that significantly improves AD model performance (e.g., AP = 96.1% on MVTec AD). We further introduce ADClick-Seg, a cross-modal framework that aligns visual features and textual prompts via a prototype-based approach for anomaly detection and localization. By combining pixel-level priors with language-guided cues, ADClick-Seg achieves state-of-the-art results on the challenging "Multi-class" AD task (AP = 80.0%, PRO = 97.5%, Pixel-AUROC = 99.1% on MVTec AD).
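For readers unfamiliar with the pixel-level metrics quoted above, here is a minimal pure-Python sketch of two of them, AP and Pixel-AUROC, computed over flattened anomaly maps. The function names are illustrative, and this is not the paper's evaluation code. AP is computed by simple ranking, so tied scores are ordered arbitrarily. PRO is omitted because it additionally requires connected-component analysis of the ground-truth regions.

```python
def flatten(grid):
    """Flatten an H x W grid (list of rows) into one list of pixels."""
    return [v for row in grid for v in row]

def pixel_auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic; tied scores get average ranks."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both normal and anomalous pixels")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

def average_precision(scores, labels):
    """Mean of precision@k taken at the rank k of every anomalous pixel."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("AP needs at least one anomalous pixel")
    order = sorted(range(len(scores)), key=lambda k: -scores[k])
    true_pos = 0
    total = 0.0
    for rank, k in enumerate(order, start=1):
        if labels[k]:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos

# Usage on a toy 2 x 2 anomaly map and its ground-truth mask (1 = defect).
score_map = [[0.9, 0.8], [0.3, 0.1]]
mask = [[1, 0], [1, 0]]
auroc = pixel_auroc(flatten(score_map), flatten(mask))
ap = average_precision(flatten(score_map), flatten(mask))
```

Defect pixels are usually a small fraction of an image, which is why AP, being sensitive to false positives among the top-scored pixels, is typically far lower than Pixel-AUROC on the same predictions.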

Problem

Research questions and friction points this paper is trying to address.

Reducing pixel-level annotation costs for industrial anomaly detection

Leveraging defective samples without extensive manual labeling

Improving anomaly detection model performance through efficient labeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Interactive image segmentation with user clicks

Cross-modal framework aligning vision and text

Prototype-based approach for anomaly localization
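The prototype-based idea listed above can be illustrated with a small sketch: each pixel feature is compared with a "normal" prototype and an "anomalous" prototype, and the two similarities are turned into an anomaly probability. How ADClick-Seg actually builds its prototypes is not described here, so the prototype vectors, the cosine similarity, the `temperature` value, and the function names are all assumptions for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)

def anomaly_map(features, normal_proto, anomaly_proto, temperature=0.07):
    """Score every pixel against two prototypes (hypothetical scheme).

    features: H x W grid whose entries are feature vectors.
    Returns an H x W grid of anomaly probabilities, obtained from a
    two-way softmax over temperature-scaled cosine similarities.
    """
    result = []
    for row in features:
        out_row = []
        for feat in row:
            s_normal = cosine(feat, normal_proto) / temperature
            s_anomaly = cosine(feat, anomaly_proto) / temperature
            m = max(s_normal, s_anomaly)  # subtract the max for stability
            e_normal = math.exp(s_normal - m)
            e_anomaly = math.exp(s_anomaly - m)
            out_row.append(e_anomaly / (e_normal + e_anomaly))
        result.append(out_row)
    return result

# Usage with toy 2-D features; in a cross-modal setup the prototypes might
# come from text embeddings of prompts describing normal and defective parts.
features = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]]
probs = anomaly_map(features, normal_proto=[1.0, 0.0], anomaly_proto=[0.0, 1.0])
```

In this toy setup the first pixel matches the normal prototype, the second matches the anomalous one, and the third lies equally close to both, so its probability is 0.5.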
