🤖 AI Summary
To address the growing challenge of verifying digital evidence authenticity amid the proliferation of AI-generated images, and the limitations of existing detectors, which neither quantify tampering severity nor offer interpretability, this paper proposes an end-to-end interpretable image tampering detection framework, Vision-Attention Anomaly Scoring (VAAS). Methodologically, it integrates a global ViT attention-guided anomaly estimation module with a SegFormer-driven patch-level self-consistency scoring module, jointly modeling pixel-wise anomaly intensity to produce continuous, spatially precise, and human-interpretable heatmaps. Evaluated on DF2023 and CASIA v2.0, the framework achieves competitive F1-score and IoU performance, and its high-fidelity heatmaps enable fine-grained assessment of tampering severity. All code and experimental materials are publicly released.
📝 Abstract
Recent advances in AI-driven image generation have introduced new challenges for verifying the authenticity of digital evidence in forensic investigations. Modern generative models can produce visually consistent forgeries that evade traditional detectors based on pixel or compression artefacts. Most existing approaches also lack an explicit measure of anomaly intensity, which limits their ability to quantify the severity of manipulation. This paper introduces Vision-Attention Anomaly Scoring (VAAS), a novel dual-module framework that integrates global attention-based anomaly estimation using Vision Transformers (ViT) with patch-level self-consistency scoring derived from SegFormer embeddings. The hybrid formulation provides a continuous and interpretable anomaly score that reflects both the location and degree of manipulation. Evaluations on the DF2023 and CASIA v2.0 datasets demonstrate that VAAS achieves competitive F1 and IoU performance while enhancing visual explainability through attention-guided anomaly maps. The framework bridges quantitative detection with human-understandable reasoning, supporting transparent and reliable image integrity assessment. The source code for all experiments, together with the materials needed to reproduce the results, is openly available.
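The abstract does not specify how the two cues are combined, so the following is only a minimal illustrative sketch of the general idea: a global attention map and a patch-level inconsistency score are each normalized and fused into one continuous heatmap. The function name `fuse_anomaly_scores`, the weight `alpha`, and the choice of distance-to-median as the self-consistency measure are all hypothetical stand-ins, not the paper's actual method.

```python
import numpy as np

def fuse_anomaly_scores(attn_map, patch_embeddings, alpha=0.5):
    """Fuse a global attention-derived anomaly map with patch-level
    self-consistency scores into one continuous heatmap in [0, 1].

    attn_map:         (H, W) per-patch attention weights (stand-in for the
                      ViT module's output).
    patch_embeddings: (H, W, D) per-patch features (stand-in for SegFormer
                      embeddings).
    alpha:            hypothetical fusion weight between the two cues.
    """
    H, W, D = patch_embeddings.shape
    flat = patch_embeddings.reshape(-1, D)

    # Self-consistency (illustrative): distance of each patch embedding to
    # the per-dimension median; outlier patches score high.
    median_vec = np.median(flat, axis=0)
    inconsistency = np.linalg.norm(flat - median_vec, axis=1).reshape(H, W)

    def minmax(x):
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    fused = alpha * minmax(attn_map) + (1.0 - alpha) * minmax(inconsistency)
    return minmax(fused)

# Toy usage: a synthetic 8x8 patch grid with one "tampered" block whose
# embeddings are shifted away from the rest.
rng = np.random.default_rng(0)
attn = rng.random((8, 8))
emb = rng.normal(size=(8, 8, 16))
emb[2:4, 2:4] += 3.0  # simulated tampered region
heatmap = fuse_anomaly_scores(attn, emb)
```

In this toy setup the shifted block receives higher heatmap values on average, mimicking how a continuous score can localize and grade manipulation rather than emit a binary verdict.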