🤖 AI Summary
Existing Segmentation-oriented Industrial Anomaly Synthesis (SIAS) methods struggle to generate anomalies with rich textural detail, pixel-level precision, and accurate spatial alignment with the background, which limits downstream segmentation performance. To address this, we propose a hierarchical dual-path diffusion framework that (1) incorporates a clean background prior to guide denoising; (2) introduces an explicit mask alignment mechanism to enforce spatial consistency between synthesized anomalies and the background; and (3) adds an anomaly-specific branch to preserve fine-grained structural details. Our method achieves state-of-the-art performance on the SIAS task across the MVTec and BTAD benchmarks, which in turn improves both the detection accuracy and the localization precision of downstream anomaly segmentation models. By jointly addressing fidelity, structural coherence, and geometric alignment, the approach targets high-fidelity, structurally consistent industrial anomaly synthesis.
📝 Abstract
Segmentation-oriented Industrial Anomaly Synthesis (SIAS) plays a pivotal role in enhancing the performance of downstream anomaly segmentation, as it provides an effective means of expanding abnormal data. However, existing SIAS methods face two critical limitations: (i) the synthesized anomalies often lack intricate texture details and fail to align precisely with the surrounding background, and (ii) they struggle to generate fine-grained, pixel-level anomalies. To address these challenges, we propose Segmentation-oriented Anomaly synthesis via Graded diffusion with Explicit mask alignment, termed STAGE. STAGE introduces a novel anomaly inference strategy that incorporates clean background information as a prior to guide the denoising distribution, enabling the model to more effectively distinguish and highlight abnormal foregrounds. Furthermore, it employs a graded diffusion framework with an anomaly-only branch that explicitly records local anomalies during both the forward and reverse processes, ensuring that subtle anomalies are not overlooked. Finally, STAGE incorporates an explicit mask alignment (EMA) strategy that progressively aligns the synthesized anomalies with the background, resulting in context-consistent and structurally coherent generations. Extensive experiments on the MVTec and BTAD datasets demonstrate that STAGE achieves state-of-the-art performance in SIAS, which in turn enhances downstream anomaly segmentation.
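The abstract does not give the formulas behind the clean background prior or the mask alignment step. As a rough intuition only, the sketch below shows generic mask-guided compositing, a technique common in inpainting-style diffusion pipelines and not STAGE's actual update rule. The function name `composite_with_background` and the toy 2x2 inputs are hypothetical and chosen purely for illustration.

```python
def composite_with_background(anomaly, background, mask):
    """Blend two images per pixel: mask * anomaly + (1 - mask) * background.

    Illustrative assumption, not the paper's method: inside the mask the
    synthesized anomaly is kept, outside it the clean background is restored.
    All inputs are equally sized 2D lists of floats; mask values lie in [0, 1].
    """
    return [
        [m * a + (1.0 - m) * b for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(anomaly, background, mask)
    ]


# Toy 2x2 example: only the top-left pixel is marked as anomalous.
background = [[0.2, 0.2], [0.2, 0.2]]
anomaly = [[0.9, 0.9], [0.9, 0.9]]
mask = [[1.0, 0.0], [0.0, 0.0]]

result = composite_with_background(anomaly, background, mask)
```

In a diffusion setting, a blend of this kind would typically be applied to intermediate samples at each denoising step, so that pixels outside the mask stay tied to the clean background while the masked region is free to develop anomaly texture. Whether STAGE uses this exact form is not stated in the abstract.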