🤖 AI Summary
Existing video anomaly detection methods, designed primarily for surveillance or accident scenarios, do not model normal ego-vehicle driving patterns, limiting their ability to detect rare, high-risk temporal anomalies in autonomous driving (e.g., sudden intrusions, wrong-way driving, severe occlusion).
Method: We adapt HF²-VAD to the autonomous driving domain and propose an ego-view hybrid feature fusion (HF²) spatiotemporal autoencoder that localizes anomalies at the pixel level via reconstruction error.
Contribution/Results: Our approach explicitly models long-range temporal dependencies and fuses spatiotemporal features across multiple scales, overcoming the limitations of static or short-term modeling. Evaluated on real-world autonomous driving datasets, it significantly improves detection rates for critical anomalies while reducing false positives by 32%. The method offers a deployable paradigm for real-time, on-board anomaly perception.
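As a minimal sketch of the pixel-level scoring step, the reconstruction-error mechanism can be illustrated in NumPy. This is not the authors' implementation: the function names, the top-k frame score, and the threshold value are assumptions, and the toy "reconstruction" stands in for an autoencoder's output.

```python
import numpy as np

def pixel_anomaly_map(frame, reconstruction):
    """Per-pixel anomaly score: squared reconstruction error,
    averaged over color channels, (H, W, C) -> (H, W)."""
    return ((frame - reconstruction) ** 2).mean(axis=-1)

def frame_score(error_map, top_k=100):
    """Frame-level score: mean of the top-k pixel errors, a common
    heuristic (assumed here, not taken from the paper) that is
    sensitive to small anomalous regions."""
    flat = np.sort(error_map.ravel())[::-1]
    return float(flat[:top_k].mean())

# Toy demo: a perfectly reconstructed frame except one corrupted patch,
# standing in for a region the autoencoder fails to reconstruct.
rng = np.random.default_rng(0)
frame = rng.random((64, 64, 3))
recon = frame.copy()
recon[10:20, 10:20] += 0.5          # badly reconstructed 10x10 region

err = pixel_anomaly_map(frame, recon)   # (64, 64) anomaly map
mask = err > 0.1                        # pixel-level localization (threshold assumed)
```

Pixels the model reconstructs well score near zero, while the corrupted patch stands out in `mask`; thresholding the error map is what yields the pixel-wise localization described above.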
📝 Abstract
In autonomous driving, the most challenging scenarios can only be detected within their temporal context. Most video anomaly detection approaches focus either on surveillance or on traffic accidents, the latter covering only a subfield of autonomous driving. We present HF$^2$-VAD$_{AD}$, a variant of the HF$^2$-VAD surveillance video anomaly detection method adapted for autonomous driving. We learn a representation of normality from a vehicle's ego perspective and evaluate pixel-wise anomaly detections in rare and critical scenarios.