ProDisc-VAD: An Efficient System for Weakly-Supervised Anomaly Detection in Video Surveillance Applications

2025-05-04
AI Summary
In weakly supervised video anomaly detection (WS-VAD), the inherent label ambiguity of multiple instance learning (MIL) severely hinders discriminative feature learning. To address this, we propose a framework that combines prototype interaction modeling with pseudo-instance discriminative enhancement. First, we design a novel Prototype Interaction Layer (PIL) to enable controllable normality modeling. Second, we introduce an extremum-guided Pseudo-Instance Discriminative Enhancement (PIDE) loss, which performs contrastive optimization exclusively on high-confidence normal and abnormal instances, thereby improving robustness and discriminability. Our method is built upon a lightweight neural network with only 0.4M parameters, over 800× fewer than ViT-based counterparts. Evaluated on ShanghaiTech and UCF-Crime, it achieves AUC scores of 97.98% and 87.12%, respectively, demonstrating state-of-the-art performance among weakly supervised approaches.

Abstract
Weakly-supervised video anomaly detection (WS-VAD) using Multiple Instance Learning (MIL) suffers from label ambiguity, hindering discriminative feature learning. We propose ProDisc-VAD, an efficient framework tackling this via two synergistic components. The Prototype Interaction Layer (PIL) provides controlled normality modeling using a small set of learnable prototypes, establishing a robust baseline without being overwhelmed by dominant normal data. The Pseudo-Instance Discriminative Enhancement (PIDE) loss boosts separability by applying targeted contrastive learning exclusively to the most reliable extreme-scoring instances (highest/lowest scores). ProDisc-VAD achieves strong AUCs (97.98% ShanghaiTech, 87.12% UCF-Crime) using only 0.4M parameters, over 800x fewer than recent ViT-based methods like VadCLIP, demonstrating exceptional efficiency alongside state-of-the-art performance. Code is available at https://github.com/modadundun/ProDisc-VAD.
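The extreme-instance selection behind the PIDE loss can be illustrated with a minimal sketch. The abstract does not give the loss formula, so everything below is an assumption: the function names `select_extremes` and `pide_loss_sketch`, the choice of `k`, the use of cosine similarity, and the margin-based contrastive form are hypothetical stand-ins, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def select_extremes(scores, k):
    """Return indices of the k highest- and k lowest-scoring instances."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[-k:], order[:k]  # (pseudo-abnormal, pseudo-normal)

def pide_loss_sketch(features, scores, k=1, margin=0.5):
    """Hypothetical contrastive loss on extreme-scoring instances only.

    Pulls pairs within the same group together and pushes cross-group
    pairs apart until their cosine similarity drops below `margin`.
    Instances with mid-range scores are ignored entirely.
    """
    high, low = select_extremes(scores, k)
    loss, pairs = 0.0, 0
    for group in (high, low):          # attract within each group
        for a in group:
            for b in group:
                if a < b:
                    loss += 1.0 - cosine(features[a], features[b])
                    pairs += 1
    for a in high:                     # repel across groups
        for b in low:
            loss += max(0.0, cosine(features[a], features[b]) - margin)
            pairs += 1
    return loss / max(pairs, 1)
```

In this sketch the loss is small when the highest- and lowest-scoring instances already occupy different directions in feature space, and grows when their features collapse together. It assumes `2 * k` does not exceed the number of instances, so the two groups do not overlap.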
Problem

Research questions and friction points this paper is trying to address.

Addresses label ambiguity in weakly-supervised video anomaly detection
Improves discriminative feature learning via prototype interaction
Enhances anomaly separability with targeted contrastive learning
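The label ambiguity named in the first point arises because supervision exists only at the video level. A minimal sketch, assuming the common MIL convention that a video (bag) is scored by the mean of its top-k snippet scores; the helper names `bag_score` and `video_level_bce` and the value of `k` are illustrative, not taken from the paper:

```python
import math

def bag_score(snippet_scores, k=2):
    """Video-level score: mean of the k highest snippet scores."""
    top = sorted(snippet_scores, reverse=True)[:k]
    return sum(top) / len(top)

def video_level_bce(snippet_scores, video_label, k=2):
    """Binary cross-entropy between the bag score and the video label."""
    p = min(max(bag_score(snippet_scores, k), 1e-7), 1 - 1e-7)
    return -(video_label * math.log(p) + (1 - video_label) * math.log(1 - p))

# Two abnormal videos whose anomalies occur at different snippets.
# The video-level loss is identical for both: the label alone does not
# say which snippets are abnormal.
early = [0.9, 0.8, 0.1, 0.1, 0.1]
late = [0.1, 0.1, 0.1, 0.8, 0.9]
```

Because the loss depends only on the bag score, the low-scoring snippets of an abnormal video receive no direct signal about whether they are normal or missed anomalies.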
Innovation

Methods, ideas, or system contributions that make the work stand out.

Prototype Interaction Layer for normality modeling
Pseudo-Instance Discriminative Enhancement loss
Efficient framework with 0.4M parameters
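One way to picture the Prototype Interaction Layer is as attention from each snippet feature to a small set of prototype vectors. The sketch below is an illustration under stated assumptions: the source does not specify the interaction, so the scaled dot-product attention, the residual connection, and the names `prototype_interaction` and `prototypes` are hypothetical choices. In a trained model the prototypes would be learnable parameters; here they are fixed lists.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def prototype_interaction(features, prototypes):
    """Refine each feature with an attention-weighted mix of prototypes.

    features:   list of T vectors of dimension D (one per snippet)
    prototypes: list of K vectors of dimension D (K is small)
    Returns (refined features, T x K attention weights).
    """
    dim = len(prototypes[0])
    refined, weights = [], []
    for f in features:
        # Similarity of this snippet to every prototype, scaled by sqrt(D).
        logits = [sum(a * b for a, b in zip(f, p)) / math.sqrt(dim)
                  for p in prototypes]
        w = softmax(logits)
        # Weighted combination of prototypes for this snippet.
        context = [sum(w[k] * prototypes[k][d] for k in range(len(prototypes)))
                   for d in range(dim)]
        refined.append([a + c for a, c in zip(f, context)])  # residual add
        weights.append(w)
    return refined, weights
```

Under these assumptions the layer holds only K × D prototype values, so its size is fixed by the number of prototypes rather than by the volume of normal data it summarizes.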
Authors
Tao Zhu
Jiangxi University of Finance and Economics, Nanchang, China
Qi Yu
Professor, Rochester Institute of Technology
Machine learning, data mining
Xinru Dong
Jiangxi University of Finance and Economics, Nanchang, China
Shiyu Li
Jiangxi University of Finance and Economics, Nanchang, China
Yue Liu
Jiangxi University of Finance and Economics, Nanchang, China
Jinlong Jiang
Jiangxi University of Finance and Economics, Nanchang, China
Lei Shu
Jiangxi University of Finance and Economics, Nanchang, China