🤖 AI Summary
This work investigates the zero-shot and few-shot generalization capabilities of large language models (LLMs) for face presentation attack detection (PAD). Addressing the scarcity of labeled PAD data, we propose a natural-language prompting framework leveraging GPT-4o—requiring no fine-tuning or explicit training—and enabling attack discrimination and attack-type classification via structured in-context learning. We report, for the first time, the emergence of attack-type reasoning in LLMs and introduce explanatory prompts to enhance decision interpretability and confidence calibration. On the SOTERIA subset, our few-shot approach achieves higher accuracy than multiple dedicated PAD models—including commercial solutions—in detecting print and replay attacks. While zero-shot performance remains improvable, it demonstrates notable promise. This study establishes a novel paradigm for low-resource biometric authentication security verification.
📝 Abstract
This study highlights the potential of ChatGPT (specifically GPT-4o) as a competitive alternative for Face Presentation Attack Detection (PAD), outperforming several PAD models, including commercial solutions, in specific scenarios. Our results show that GPT-4o demonstrates high consistency, particularly in few-shot in-context learning, where its performance improves as more examples are provided (reference data). We also observe that detailed prompts enable the model to provide scores reliably, a behavior not observed with concise prompts. Additionally, explanation-seeking prompts slightly enhance the model's performance by improving its interpretability. Remarkably, the model exhibits emergent reasoning capabilities, correctly predicting the attack type (print or replay) with high accuracy in few-shot scenarios, despite not being explicitly instructed to classify attack types. Despite these strengths, GPT-4o faces challenges in zero-shot tasks, where its performance is limited compared to specialized PAD systems. Experiments were conducted on a subset of the SOTERIA dataset, ensuring compliance with data privacy regulations by using only data from consenting individuals. These findings underscore GPT-4o's promise in PAD applications, laying the groundwork for future research to address broader data privacy concerns and improve cross-dataset generalization. Code available here: https://gitlab.idiap.ch/bob/bob.paper.wacv2025_chatgpt_face_pad