Towards Efficient Pixel Labeling for Industrial Anomaly Detection and Localization

📅 2025-09-05
🤖 AI Summary
Industrial anomaly detection (AD) faces a labeling bottleneck: most methods train only on defect-free samples, because using defective samples usually requires pixel-level annotations that are costly to produce at scale. This paper introduces ADClick, an interactive image segmentation (IIS) algorithm for industrial AD that generates pixel-wise anomaly annotations from only a few user clicks and a brief textual description. The resulting labels improve AD model performance, reaching 96.1% AP on MVTec AD. The paper also introduces ADClick-Seg, a cross-modal framework that aligns visual features with textual prompts through a prototype-based approach, combining pixel-level priors with language-guided cues. On the "Multi-class" AD task on MVTec AD, ADClick-Seg reports 80.0% AP, 97.5% PRO, and 99.1% Pixel-AUROC, which the authors describe as state-of-the-art.

📝 Abstract
Industrial product inspection is often performed using Anomaly Detection (AD) frameworks trained solely on non-defective samples. Although defective samples can be collected during production, leveraging them usually requires pixel-level annotations, limiting scalability. To address this, we propose ADClick, an Interactive Image Segmentation (IIS) algorithm for industrial anomaly detection. ADClick generates pixel-wise anomaly annotations from only a few user clicks and a brief textual description, enabling precise and efficient labeling that significantly improves AD model performance (e.g., AP = 96.1% on MVTec AD). We further introduce ADClick-Seg, a cross-modal framework that aligns visual features and textual prompts via a prototype-based approach for anomaly detection and localization. By combining pixel-level priors with language-guided cues, ADClick-Seg achieves state-of-the-art results on the challenging "Multi-class" AD task (AP = 80.0%, PRO = 97.5%, Pixel-AUROC = 99.1% on MVTec AD).
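For readers unfamiliar with the pixel-level metrics quoted in the abstract, the sketch below shows one common way to compute AP and Pixel-AUROC over flattened pixel scores. It is an illustration only, not the paper's evaluation code: the function names `pixel_auroc` and `pixel_average_precision` are hypothetical, ties in AP are not given special treatment, and PRO is omitted because it additionally requires connected-component analysis of the ground-truth masks.

```python
from typing import Sequence


def pixel_auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC over flattened pixels.

    Computed as the probability that a randomly chosen anomalous pixel
    (label 1) scores higher than a randomly chosen normal pixel (label 0),
    with ties counted as 0.5. The pairwise loop is quadratic, which is fine
    for small examples; a rank-based formulation scales better.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal pixel")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pixel_average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Average precision over flattened pixels.

    Pixels are ranked by descending score; AP is the mean of the precision
    values measured at the rank of each anomalous pixel.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one anomalous pixel")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos
```

Both functions expect the anomaly score map and the binary ground-truth mask to be flattened into equal-length sequences before the call.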
Problem

Research questions and friction points this paper is trying to address.

Reducing pixel-level annotation costs for industrial anomaly detection
Leveraging defective samples without extensive manual labeling
Improving anomaly detection model performance through efficient labeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interactive image segmentation with user clicks
Cross-modal framework aligning vision and text
Prototype-based approach for anomaly localization
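The ideas listed above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and should not be read as the ADClick or ADClick-Seg implementation: `click_map` encodes user clicks as a binary disk map (one common encoding in interactive segmentation), and `prototype_scores` compares each pixel feature to two hypothetical prototype vectors by cosine similarity. In a vision-language setting the prototypes might be derived from text embeddings; here they are plain vectors, and the temperature `tau` is an arbitrary choice.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def click_map(height: int, width: int,
              clicks: Sequence[Tuple[int, int]], radius: int = 2) -> List[List[float]]:
    """Encode (row, col) clicks as a binary map: 1.0 within `radius` of any click."""
    grid = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if any((r - cr) ** 2 + (c - cc) ** 2 <= radius ** 2 for cr, cc in clicks):
                grid[r][c] = 1.0
    return grid


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def prototype_scores(features: List[List[Vector]], normal_proto: Vector,
                     anomaly_proto: Vector, tau: float = 0.1) -> List[List[float]]:
    """Per-pixel anomaly score in (0, 1).

    Applies a sigmoid to the temperature-scaled difference between the
    similarity to the anomaly prototype and the similarity to the normal
    prototype. Scores above 0.5 mean the pixel is closer to the anomaly
    prototype. (Hypothetical scoring rule, chosen for simplicity.)
    """
    scores = []
    for row in features:
        out_row = []
        for f in row:
            diff = cosine(f, anomaly_proto) - cosine(f, normal_proto)
            out_row.append(1.0 / (1.0 + math.exp(-diff / tau)))
        scores.append(out_row)
    return scores
```

For example, `click_map(5, 5, [(2, 2)], radius=1)` marks the clicked pixel and its four direct neighbours, and a pixel feature identical to `anomaly_proto` receives a score close to 1.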
Jingqi Wu
Jiangxi Normal University, Jiangxi, China
Hanxi Li
Jiangxi Normal University, Jiangxi, China
Lin Yuanbo Wu
Swansea University
Computer Vision, AI Generation, Trustworthy AI, Autonomous System, Embodied Visual Intelligence
Hao Chen
Zhejiang University, Zhejiang, China
Deyin Liu
Anhui University, Anhui, China
Peng Wang
Northwestern Polytechnical University, Shaanxi, China