🤖 AI Summary
While AI-generated images increasingly evade conventional detectors, they often retain human-perceivable visual artifacts, posing a critical challenge for reliable detection and interpretability. Method: This work introduces an explainable vision-language model (VLM)-based detection paradigm, proposing Mirage, the first synthetic dataset specifically curated to capture "evasion artifacts": samples that bypass state-of-the-art detectors yet exhibit salient visual anomalies. We systematically evaluate VLMs under zero-shot and few-shot settings for both detection and artifact localization. Results: VLMs achieve significantly higher accuracy than traditional methods on images with visible artifacts and provide faithful spatial attribution, demonstrating strong explainability. However, their performance degrades sharply when artifacts become subtle, revealing a fundamental dependence on perceptible visual cues. This study pioneers the application of VLMs to explainable AI-image detection, bridging the gap between human and machine judgment and establishing a new benchmark dataset and evaluation framework.
📝 Abstract
Recent advances in image generation have produced synthetic images that are increasingly difficult for standard AI detectors to identify, even though they often remain distinguishable by humans. To investigate this discrepancy, we introduce **Mirage**, a curated dataset comprising a diverse range of AI-generated images exhibiting visible artifacts, on which current state-of-the-art detection methods largely fail. Furthermore, we investigate whether Large Vision-Language Models (LVLMs), which are increasingly employed as substitutes for human judgment in various tasks, can be leveraged for explainable AI-image detection. Our experiments on both Mirage and existing benchmark datasets demonstrate that while LVLMs are highly effective at detecting AI-generated images with visible artifacts, their performance declines when confronted with images lacking such cues.