Face Presentation Attack Detection via Content-Adaptive Spatial Operators

📅 2026-02-21
🤖 AI Summary
This work proposes a lightweight face anti-spoofing method that uses only a single RGB frame to detect presentation attacks such as print, replay, and mask spoofing. Built on MobileNetV3, the approach integrates a content-adaptive spatial operator (involution) that generates position-specific, channel-shared kernels conditioned on the input, which improves sensitivity to localized spoof cues at minimal computational cost (3.6M parameters; 0.64 GFLOPs at 256×256 input). The model is trained end-to-end with binary cross-entropy loss; ablations indicate that placing the operator near the network head with moderate group sharing gives the best accuracy-efficiency balance. On Replay-Attack, Replay-Mobile, ROSE-Youtu, and OULU-NPU it reaches test accuracies of 100%, 100%, 98.9%, and 99.7%, AUCs of 1.00, 1.00, 0.9995, and 0.9999, and HTERs of 0.00%, 0.00%, 0.82%, and 0.44%, respectively. On the large-scale SiW-Mv2 Protocol-1, it attains 95.45% accuracy, 3.11% HTER, and 3.13% EER.
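
As a rough illustration of how the error rates quoted above are commonly defined, the sketch below computes HTER as the mean of the false acceptance rate (FAR) and false rejection rate (FRR) at a fixed threshold, and approximates EER as the operating point where the two rates are closest. The function names, the toy scores, and the convention that a higher score means "live" are assumptions made for illustration; this page does not state the paper's exact thresholding protocol.

```python
def far_frr(scores, labels, threshold):
    """Return (FAR, FRR) at a threshold.

    Assumed convention: labels are 1 for live (bona fide) and 0 for attack,
    and a sample is accepted as live when its score >= threshold.
    """
    attacks = [s for s, y in zip(scores, labels) if y == 0]
    lives = [s for s, y in zip(scores, labels) if y == 1]
    far = sum(s >= threshold for s in attacks) / len(attacks)  # attacks accepted
    frr = sum(s < threshold for s in lives) / len(lives)       # live faces rejected
    return far, frr


def hter(scores, labels, threshold):
    """Half Total Error Rate: the mean of FAR and FRR at one threshold."""
    far, frr = far_frr(scores, labels, threshold)
    return (far + frr) / 2


def eer(scores, labels):
    """Approximate Equal Error Rate by scanning the observed scores as thresholds.

    Returns (error_rate, threshold) at the point where |FAR - FRR| is smallest.
    """
    best = None
    for t in sorted(set(scores)):
        far, frr = far_frr(scores, labels, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy example (made-up scores, not data from the paper).
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_hter = hter(scores, labels, threshold=0.5)
toy_eer, toy_eer_threshold = eer(scores, labels)
```

A common practice is to pick the threshold on a development split (for example at its EER point) and then report HTER on the test split at that fixed threshold, which is why the two figures can differ slightly.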

📝 Abstract
Face presentation attack detection (FacePAD) is critical for securing facial authentication against print, replay, and mask-based spoofing. This paper proposes CASO-PAD, an RGB-only, single-frame model that enhances MobileNetV3 with content-adaptive spatial operators (involution) to better capture localized spoof cues. Unlike spatially shared convolution kernels, the proposed operator generates location-specific, channel-shared kernels conditioned on the input, improving spatial selectivity with minimal overhead. CASO-PAD remains lightweight (3.6M parameters; 0.64 GFLOPs at 256×256) and is trained end-to-end using a standard binary cross-entropy objective. Extensive experiments on Replay-Attack, Replay-Mobile, ROSE-Youtu, and OULU-NPU demonstrate strong performance, achieving 100/100/98.9/99.7% test accuracy, AUC of 1.00/1.00/0.9995/0.9999, and HTER of 0.00/0.00/0.82/0.44%, respectively. On the large-scale SiW-Mv2 Protocol-1 benchmark, CASO-PAD further attains 95.45% accuracy with 3.11% HTER and 3.13% EER, indicating improved robustness under diverse real-world attacks. Ablation studies show that placing the adaptive operator near the network head and using moderate group sharing yields the best accuracy-efficiency balance. Overall, CASO-PAD provides a practical pathway for robust, on-device FacePAD with mobile-class compute and without auxiliary sensors or temporal stacks.
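
To make the phrase "location-specific, channel-shared kernels conditioned on the input" concrete, here is a minimal pure-Python sketch of a generic involution-style operator. It is not CASO-PAD's implementation: the names `generate_kernel`, `involution`, `w_reduce`, and `w_span` are hypothetical, the kernel generator is assumed to be a small linear, ReLU, linear bottleneck applied to each pixel, and stride 1 with zero padding is assumed. Nested lists are used for readability rather than speed.

```python
def generate_kernel(pixel, w_reduce, w_span):
    """Map one pixel's channel vector to a flat kernel of length groups * K * K."""
    # Bottleneck: linear -> ReLU -> linear, applied to this pixel only.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pixel))) for row in w_reduce]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_span]


def involution(x, w_reduce, w_span, kernel_size=3, groups=1):
    """Apply an involution-style operator to x, a nested list of shape H x W x C.

    Each position gets its own K x K kernel (location-specific), and every
    channel in a group reuses that kernel (channel-shared).
    """
    H, W, C = len(x), len(x[0]), len(x[0][0])
    K, r = kernel_size, kernel_size // 2
    ch_per_group = C // groups
    out = [[[0.0] * C for _ in range(W)] for _ in range(H)]
    for i in range(H):
        for j in range(W):
            kernel = generate_kernel(x[i][j], w_reduce, w_span)
            for c in range(C):
                g = c // ch_per_group  # group whose kernel this channel shares
                acc = 0.0
                for u in range(K):
                    for v in range(K):
                        ii, jj = i + u - r, j + v - r
                        if 0 <= ii < H and 0 <= jj < W:  # zero padding at borders
                            acc += kernel[g * K * K + u * K + v] * x[ii][jj][c]
                out[i][j][c] = acc
    return out


# Toy weights: the hidden unit copies channel 0, and only the kernel's centre
# tap (index 4 of 9) is driven by it, so each output equals x[i][j][0] * x[i][j][c].
w_reduce = [[1.0, 0.0]]                                  # hidden_dim x C
w_span = [[1.0] if k == 4 else [0.0] for k in range(9)]  # (groups * K * K) x hidden_dim
x = [[[2.0, 3.0], [1.0, 5.0]]]                           # H=1, W=2, C=2
out = involution(x, w_reduce, w_span)
```

In this toy case the same weights produce a centre tap of 2.0 at the first pixel and 1.0 at the second, so the effective filter changes with the content at each location, whereas a standard convolution would apply one fixed kernel everywhere. Raising `groups` lets subsets of channels use different generated kernels, which is the "group sharing" knob the ablations refer to.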
Problem

Research questions and friction points this paper is trying to address.

Face Presentation Attack Detection
Spoofing
RGB-only
Mobile Authentication
Biometric Security
Innovation

Methods, ideas, or system contributions that make the work stand out.

content-adaptive spatial operators
involution
lightweight FacePAD
single-frame RGB
MobileNetV3