🤖 AI Summary
In industrial anomaly detection (IAD), robust cross-modal fusion of 2D images and 3D point clouds remains challenging. This paper proposes an unsupervised multimodal fusion framework (MAFR) that (i) constructs a unified latent space to align RGB and point cloud features, (ii) employs attention-guided, modality-specific decoders to reconstruct the features of each modality, and (iii) localizes anomalies from the reconstruction error between input features and their restored counterparts. The method comprises a shared fusion encoder, attention-based decoders, a composite loss function, and a reconstruction-based scoring mechanism. Evaluated on MVTec 3D-AD and Eyecandies, it achieves mean image-level AUROC (I-AUROC) scores of 0.972 and 0.901, respectively, which the authors report as state-of-the-art, and it also performs strongly in few-shot settings. Its core contributions are cross-modal latent synthesis and attention-driven, modality-specific reconstruction, which together enable label-free 2D–3D anomaly localization.
📝 Abstract
Industrial anomaly detection (IAD) increasingly benefits from integrating 2D and 3D data, but robust cross-modal fusion remains challenging. We propose a novel unsupervised framework, Multi-Modal Attention-Driven Fusion Restoration (MAFR), which synthesises a unified latent space from RGB images and point clouds using a shared fusion encoder, followed by attention-guided, modality-specific decoders. Anomalies are localised by measuring reconstruction errors between input features and their restored counterparts. Evaluations on the MVTec 3D-AD and Eyecandies benchmarks demonstrate that MAFR achieves state-of-the-art results, with mean image-level AUROC (I-AUROC) scores of 0.972 and 0.901, respectively. The framework also exhibits strong performance in few-shot learning settings, and ablation studies confirm the critical roles of the fusion architecture and composite loss. MAFR offers a principled approach for fusing visual and geometric information, advancing the robustness and accuracy of industrial anomaly detection. Code is available at https://github.com/adabrh/MAFR
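The reconstruction-error idea described in the abstract can be illustrated with a minimal sketch. This is not the MAFR implementation: the function names (`l2_error`, `anomaly_map`, `image_score`), the use of Euclidean distance, the summation of the two per-modality errors, and the max-pooling to an image-level score are all assumptions made for illustration, since the abstract does not specify these details. Features are plain Python lists here; a real pipeline would likely use tensors.

```python
import math
from typing import List, Sequence

Feature = Sequence[float]


def l2_error(original: Feature, restored: Feature) -> float:
    """Euclidean distance between an input feature and its reconstruction."""
    if len(original) != len(restored):
        raise ValueError("feature dimensions must match")
    return math.sqrt(sum((o - r) ** 2 for o, r in zip(original, restored)))


def anomaly_map(
    rgb_in: Sequence[Feature],
    rgb_out: Sequence[Feature],
    pc_in: Sequence[Feature],
    pc_out: Sequence[Feature],
) -> List[float]:
    """Per-location anomaly score built from both modalities.

    Assumption: the RGB and point-cloud errors are combined by a plain sum.
    The combination rule used by the paper is not stated in the abstract.
    """
    if not (len(rgb_in) == len(rgb_out) == len(pc_in) == len(pc_out)):
        raise ValueError("all inputs must cover the same number of locations")
    return [
        l2_error(a, b) + l2_error(c, d)
        for a, b, c, d in zip(rgb_in, rgb_out, pc_in, pc_out)
    ]


def image_score(scores: Sequence[float]) -> float:
    """Image-level score (assumption: the maximum per-location score)."""
    return max(scores)


# Toy example: three spatial locations, 2-dimensional features per modality.
# Only the third location is reconstructed poorly in both modalities.
rgb_features = [[1, 0], [0, 1], [1, 1]]
rgb_restored = [[1, 0], [0, 1], [4, 5]]
pc_features = [[0, 0], [2, 2], [0, 0]]
pc_restored = [[0, 0], [2, 2], [6, 8]]

scores = anomaly_map(rgb_features, rgb_restored, pc_features, pc_restored)
print(scores)               # [0.0, 0.0, 15.0]
print(image_score(scores))  # 15.0
```

Locations that the decoders restore faithfully score near zero, while a location that neither decoder can restore receives a high score. This is the intuition behind training on defect-free samples only: the model learns to reconstruct normal features well, so large errors flag likely anomalies.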