AnomalyMoE: Towards a Language-free Generalist Model for Unified Visual Anomaly Detection

📅 2025-08-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing anomaly detection methods are typically tailored to a specific anomaly type (e.g., texture defects or logical inconsistencies) and generalize poorly across domains. To address this, we propose AnomalyMoE, the first generic visual anomaly detection framework based on a Mixture-of-Experts (MoE) architecture. It models anomalies at three decoupled levels (local structural, component-level semantic, and global logical), enabling unified detection across modalities and tasks. We introduce two novel mechanisms: Expert Information Repulsion, which enhances expert diversity, and Expert Selection Balancing, which improves expert utilization. Coupled with hierarchical feature reconstruction, the framework supports unsupervised anomaly localization and fine-grained classification. Evaluation on eight heterogeneous benchmarks covering industrial images, 3D point clouds, medical imaging, video surveillance, and logical anomalies shows that it consistently outperforms domain-specific state-of-the-art methods.

📝 Abstract
Anomaly detection is a critical task across numerous domains and modalities, yet existing methods are often highly specialized, which limits their generalizability. These specialized models, tailored to specific anomaly types such as textural defects or logical errors, typically perform poorly when deployed outside their designated contexts. To overcome this limitation, we propose AnomalyMoE, a novel and universal anomaly detection framework based on a Mixture-of-Experts (MoE) architecture. Our key insight is to decompose the complex anomaly detection problem into three distinct semantic hierarchies: local structural anomalies, component-level semantic anomalies, and global logical anomalies. AnomalyMoE correspondingly employs three dedicated expert networks at the patch, component, and global levels, each specialized in reconstructing features and identifying deviations at its designated semantic level. This hierarchical design allows a single model to concurrently understand and detect a wide spectrum of anomalies. Furthermore, we introduce an Expert Information Repulsion (EIR) module to promote expert diversity and an Expert Selection Balancing (ESB) module to ensure the comprehensive utilization of all experts. Experiments on 8 challenging datasets spanning industrial imaging, 3D point clouds, medical imaging, video surveillance, and logical anomaly detection demonstrate that AnomalyMoE establishes new state-of-the-art performance, significantly outperforming specialized methods in their respective domains.
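The reconstruction-and-deviation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the top-k softmax gate, the cosine-distance score, the max fusion across levels, and all names (`route_top_k`, `moe_reconstruct`, `level_score`, `image_score`) are assumptions made for illustration, and the experts are plain Python callables standing in for trained networks.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a degenerate vector as maximally uninformative
    return 1.0 - dot / (norm_a * norm_b)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k):
    """Select the k highest-probability experts and renormalise their weights."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_reconstruct(feature, experts, gate_logits, k=2):
    """Weighted sum of the selected experts' reconstructions of `feature`."""
    weights = route_top_k(gate_logits, k)
    output = [0.0] * len(feature)
    for index, weight in weights.items():
        reconstruction = experts[index](feature)
        output = [o + weight * r for o, r in zip(output, reconstruction)]
    return output


def level_score(feature, experts, gate_logits, k=2):
    """Anomaly score at one level: deviation between a feature and its reconstruction."""
    reconstructed = moe_reconstruct(feature, experts, gate_logits, k)
    return cosine_distance(feature, reconstructed)


def image_score(level_scores):
    """Fuse per-level scores by taking the maximum (an assumed fusion rule)."""
    return max(level_scores.values())
```

In this sketch a feature that the selected experts reconstruct well scores near 0, while one they reconstruct poorly scores higher; the same routine would be applied separately to patch-, component-, and global-level features before fusion.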
Problem

Research questions and friction points this paper is trying to address.

Develops a universal model for diverse visual anomaly detection
Addresses limitations of specialized models in generalizability
Detects anomalies across structural, semantic, and logical hierarchies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts for anomaly detection
Hierarchical semantic decomposition approach
Expert diversity and utilization modules
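The two goals behind the expert diversity and utilization modules can be sketched with generic stand-in penalties. The summary does not give the formulas for EIR or ESB, so neither function below is the paper's method: `repulsion_penalty` uses mean squared pairwise cosine similarity between expert outputs as an assumed proxy for diversity, and `balance_penalty` uses the auxiliary load-balancing loss familiar from sparse MoE models such as Switch Transformer as an assumed proxy for balanced selection. All names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def repulsion_penalty(expert_outputs):
    """Mean squared pairwise cosine similarity between expert outputs.

    Assumed stand-in for a diversity objective: 0 when all experts produce
    mutually orthogonal outputs, 1 when they produce collinear outputs.
    """
    count = len(expert_outputs)
    pairs = [(i, j) for i in range(count) for j in range(i + 1, count)]
    if not pairs:
        return 0.0
    total = sum(
        cosine_similarity(expert_outputs[i], expert_outputs[j]) ** 2
        for i, j in pairs
    )
    return total / len(pairs)


def balance_penalty(routing_probs, selected):
    """Switch-style load-balancing loss: n * sum_i f_i * p_i.

    routing_probs: one row of gate probabilities per token.
    selected: index of the expert chosen for each token.
    f_i is the fraction of tokens sent to expert i and p_i the mean gate
    probability of expert i. The value is 1.0 under perfectly uniform
    routing and grows as routing collapses onto a few experts.
    """
    num_experts = len(routing_probs[0])
    num_tokens = len(routing_probs)
    fraction = [0.0] * num_experts
    for chosen in selected:
        fraction[chosen] += 1.0 / num_tokens
    mean_prob = [
        sum(row[i] for row in routing_probs) / num_tokens
        for i in range(num_experts)
    ]
    return num_experts * sum(f * p for f, p in zip(fraction, mean_prob))
```

Adding penalties of this kind to a training loss would push experts toward dissimilar outputs and toward being selected at comparable rates, which is the behaviour the two modules are described as encouraging.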
Zhaopeng Gu
Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Bingke Zhu
Institute of Automation, Chinese Academy of Sciences
Guibo Zhu
Institute of Automation, Chinese Academy of Sciences
Artificial Intelligence, Computer Vision, Machine Learning
Yingying Chen
Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Wei Ge
Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Ming Tang
Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Jinqiao Wang
Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Objecteye Inc., Beijing, China