View-Invariant Pixelwise Anomaly Detection in Multi-object Scenes with Adaptive View Synthesis

📅 2024-06-26
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing pixel-level visual anomaly detection methods assume a fixed camera position. That assumption breaks down in the built environment, where camera pose can vary widely between captures. Method: We formulate Scene AD, a new unsupervised anomaly detection task that takes one set of anomaly-free images and one set of test images that may contain anomalies, with no labeled segmentation data, and localizes anomalies across views in multi-object scenes. We propose OmniAD, a refinement of the reverse distillation framework, together with two data augmentation strategies that use adaptive NeRF-based view synthesis and camera localization to improve generalization. Contribution/Results: OmniAD is evaluated on ToyCity, a newly constructed multi-object Scene AD dataset, and on the established single-object dataset MAD. Refining reverse distillation yields a 40% improvement in pixel-level anomaly detection, and the method shows marked improvement over baseline approaches.

📝 Abstract
Visual anomaly detection in the built environment is a valuable tool for applications such as infrastructure assessment, construction monitoring, security surveillance, and urban planning. Anomaly detection approaches are typically unsupervised and work by detecting deviations from an expected state, with no assumptions made about the exact type of deviation. Unsupervised pixel-level anomaly detection methods have been developed to successfully recognize and segment anomalies; however, existing techniques are designed for industrial settings with a fixed camera position. In the built environment, images are periodically captured by a camera operated manually or mounted on aerial or ground vehicles. The camera pose between successive collections may vary widely, voiding a fundamental assumption in existing anomaly detection approaches. To address this gap, we introduce the problem of Scene Anomaly Detection (Scene AD), where the goal is to detect anomalies from two sets of images: one set without anomalies and one set that may or may not contain anomalies. No labeled semantic segmentation data are provided for training. We propose a novel network, OmniAD, to tackle Scene AD by refining the reverse distillation anomaly detection method, leading to a 40% improvement in pixel-level anomaly detection. Additionally, we introduce two new data augmentation strategies that leverage novel view synthesis and camera localization to enhance generalization. We evaluate our approach both qualitatively and quantitatively on a new dataset, ToyCity, the first Scene AD dataset featuring multiple objects, as well as on the established single-object-centric dataset, MAD. Our method demonstrates marked improvement over baseline approaches, paving the way for robust anomaly detection in scenes with real-world camera pose variations commonly observed in the built environment. https://drags99.github.io/OmniAD/
Problem

Research questions and friction points this paper is trying to address.

Detect pixel-level anomalies in scenes with varying camera poses
Improve unsupervised anomaly detection without labeled segmentation data
Enhance generalization using novel view synthesis and localization
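The abstract does not say how augmented views are chosen, so the following is only a minimal sketch of one plausible ingredient: jittering an estimated camera pose to get nearby viewpoints that a view synthesis model could then render. The function names (`rodrigues`, `perturb_pose`), the 4x4 camera-to-world convention, and the jitter ranges are assumptions made for illustration, not details from the paper.

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix for a rotation of `angle` radians about `axis`."""
    axis = axis / np.linalg.norm(axis)
    x, y, z = axis
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def perturb_pose(c2w, rng, max_angle_deg=5.0, max_shift=0.05):
    """Return a copy of a 4x4 camera-to-world pose with a small random
    rotation and translation applied (hypothetical jitter ranges)."""
    angle = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
    r_delta = rodrigues(rng.normal(size=3), angle)
    shift = rng.uniform(-max_shift, max_shift, size=3)
    out = np.array(c2w, dtype=float)
    out[:3, :3] = r_delta @ out[:3, :3]   # rotate the camera orientation
    out[:3, 3] = out[:3, 3] + shift       # move the camera centre
    return out

rng = np.random.default_rng(0)
base_pose = np.eye(4)  # stand-in for a pose recovered by camera localization
augmented_poses = [perturb_pose(base_pose, rng) for _ in range(4)]
```

Each entry of `augmented_poses` would be handed to a renderer to produce an extra training view; the renderer itself is outside the scope of this sketch.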
Innovation

Methods, ideas, or system contributions that make the work stand out.

OmniAD network refines the reverse distillation anomaly detection method
Uses adaptive view synthesis to improve generalization
Introduces two data augmentation strategies based on novel view synthesis and camera localization
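For readers unfamiliar with reverse distillation, the sketch below shows how that family of methods commonly turns teacher and student feature maps into a pixel-level anomaly map: per-location cosine distance at each scale, upsampled and summed. It illustrates the general scoring idea only, not OmniAD's refinements; the function names, the NumPy implementation, and the nearest-neighbour upsampling are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def cosine_distance_map(teacher, student, eps=1e-8):
    """Per-location (1 - cosine similarity) between two (C, H, W) feature maps."""
    dot = (teacher * student).sum(axis=0)
    norm = np.linalg.norm(teacher, axis=0) * np.linalg.norm(student, axis=0)
    return 1.0 - dot / (norm + eps)

def upsample_nearest(score, out_hw):
    """Nearest-neighbour resize of an (H, W) score map to out_hw."""
    h, w = score.shape
    rows = np.arange(out_hw[0]) * h // out_hw[0]
    cols = np.arange(out_hw[1]) * w // out_hw[1]
    return score[rows[:, None], cols[None, :]]

def anomaly_map(teacher_feats, student_feats, out_hw):
    """Sum the upsampled per-scale distance maps into one (H, W) anomaly map."""
    total = np.zeros(out_hw)
    for t, s in zip(teacher_feats, student_feats):
        total += upsample_nearest(cosine_distance_map(t, s), out_hw)
    return total

# Toy example: the student matches the teacher everywhere except one location.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 4, 4))
student = teacher.copy()
student[:, 1, 2] = -teacher[:, 1, 2]  # simulate a failed reconstruction
amap = anomaly_map([teacher], [student], out_hw=(8, 8))
```

In the toy example, the score is close to 2 in the image region that maps back to feature location (1, 2) and close to 0 elsewhere; thresholding such a map gives a pixel-level segmentation.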
Subin Varghese
Vedhus Hoskere