🤖 AI Summary
In industrial visual inspection, the scarcity of anomalous samples limits anomaly detection performance, and existing generative methods suffer from low diversity, unnatural anomaly-background blending, and misalignment between generated masks and anomalies. To address these challenges, the authors propose DualAnoDiff, a diffusion-based few-shot anomaly image generation model with two interrelated generation branches: one synthesizes the whole image while the other generates the corresponding anomaly part, so each image and its anomaly mask are produced as an aligned pair. Background and shape information is further extracted to mitigate distortion and blurriness under few-shot conditions. Extensive experiments show that DualAnoDiff achieves state-of-the-art generation diversity, realism, and mask accuracy, and significantly improves downstream anomaly detection, localization, and classification.
📝 Abstract
The performance of anomaly inspection in industrial manufacturing is constrained by the scarcity of anomaly data. To overcome this challenge, researchers have begun employing anomaly generation approaches to augment anomaly datasets. However, existing anomaly generation methods suffer from limited diversity in the generated anomalies and struggle to blend the generated anomaly seamlessly into the original image. Moreover, the generated mask is usually not aligned with the generated anomaly. In this paper, we address these challenges from a new perspective: simultaneously generating a whole image and its corresponding anomaly part as a pair. We propose DualAnoDiff, a novel diffusion-based few-shot anomaly image generation model that can generate diverse and realistic anomaly images using a dual-interrelated diffusion model, in which one branch generates the whole image while the other generates the anomaly part. Moreover, we extract background and shape information to mitigate the distortion and blurriness that arise in few-shot image generation. Extensive experiments demonstrate the superiority of our proposed model over state-of-the-art methods in terms of diversity, realism, and mask accuracy. Overall, our approach significantly improves the performance of downstream anomaly inspection tasks, including anomaly detection, anomaly localization, and anomaly classification.
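The core idea of the dual-interrelated design, two generation branches denoising in lockstep while exchanging information, can be illustrated with a toy NumPy simulation. Everything below is a hypothetical stand-in, not the paper's implementation: `toy_denoise` replaces the learned score networks with a simple pull toward fixed conditioning targets, and `mix` models the cross-branch interaction. The sketch only shows why coupling the branches yields a mask that is aligned with the generated anomaly by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoise(latent, cond, t, T):
    # Stand-in for one reverse-diffusion step: pull the latent toward its
    # conditioning target more strongly as t decreases (NOT a real UNet).
    alpha = 1.0 - t / T
    return latent + alpha * (cond - latent)

def dual_interrelated_sample(H=8, W=8, T=10, mix=0.3, thresh=0.5):
    """Conceptual sketch of two coupled diffusion branches.

    `global_target` / `anomaly_target` are hypothetical conditioning
    signals; `mix` models the information exchange between the
    whole-image branch and the anomaly branch."""
    global_target = np.full((H, W), 0.2)   # plain background intensity
    anomaly_target = np.zeros((H, W))
    anomaly_target[2:5, 3:6] = 1.0         # localized defect region

    x_global = rng.normal(size=(H, W))     # whole-image latent (noise)
    x_anom = rng.normal(size=(H, W))       # anomaly-part latent (noise)

    in_region = anomaly_target > 0
    for t in range(T, 0, -1):
        # Each branch takes its own denoising step at the shared timestep.
        x_global = toy_denoise(x_global, global_target + anomaly_target, t, T)
        x_anom = toy_denoise(x_anom, anomaly_target, t, T)
        # Interrelation: each branch blends in the other's state, so the
        # anomaly stays consistent with the image it must be fused into.
        x_global, x_anom = (
            (1 - mix) * x_global + mix * np.where(in_region, x_anom, x_global),
            (1 - mix) * x_anom + mix * x_global * in_region,
        )

    # The mask is read directly off the anomaly branch, so it is aligned
    # with the generated anomaly by construction rather than post-hoc.
    mask = (x_anom > thresh).astype(np.uint8)
    return x_global, x_anom, mask

img, anom, mask = dual_interrelated_sample()
print(mask.sum())  # pixels flagged anomalous
```

Because the mask is thresholded from the same latent that generated the anomaly, mask-anomaly misalignment cannot arise in this toy setup; the paper's contribution is making this pairing work with real diffusion networks.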