Impostor: An Agent-Curated Benchmark for Realistic AIGC Manipulation Localization

📅 2026-06-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses three limitations of existing image manipulation detection and localization benchmarks: insufficient visual realism, limited manipulation diversity, and inadequate coverage of modern AI-generated content (AIGC) models. These gaps keep the benchmarks from reflecting recent advances in generative image editing. The authors introduce Impostor, a benchmark of 100,000 realistic manipulated images generated automatically by CraftAgent, a closed-loop agent framework. Impostor covers seven recent AIGC models and three manipulation types, includes images with multiple manipulated regions, and relies on quality validation with iterative reflection to keep edits visually realistic. The paper also proposes PhaseAware-Net (PANet), a detector that improves localization through local phase modeling and semantic-forensic consistency learning. Experiments show that Impostor is challenging for existing large vision-language models and specialized forensic methods, while PANet achieves superior performance on Impostor and multiple public benchmarks.

📝 Abstract

Recent advances in generative image editing have improved the realism and controllability of localized image manipulation, raising new challenges for image manipulation detection and localization (IMDL). However, existing IMDL benchmarks still have limitations in visual realism, manipulation diversity, and generator coverage, making it difficult to reflect recent trends in image manipulation. To address these limitations, we introduce Impostor, a high-quality AI-edited image manipulation localization dataset containing 100K manipulated images. Impostor is constructed by CraftAgent, a closed-loop agent framework that integrates scene perception, editing planning, manipulation execution, quality validation, and iterative reflection to automatically generate diverse and visually realistic manipulated images. Moreover, Impostor contains images generated by seven recent AIGC models across three manipulation types and includes multiple manipulated regions, providing a more comprehensive benchmark for AIGC-based IMDL. Furthermore, we propose PhaseAware-Net (PANet), a semantic-forensic framework that introduces local phase modeling and semantic-forensic consistency learning to better localize semantically plausible yet forensically disrupted manipulated regions. Extensive experiments show that Impostor poses significant challenges to existing large vision-language models (LVLMs) and specialized IMDL methods, while PANet achieves superior performance on Impostor and multiple public benchmarks.
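The closed loop that the abstract attributes to CraftAgent (scene perception, editing planning, manipulation execution, quality validation, iterative reflection) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `closed_loop_edit` and `EditResult`, the callable signatures, and the `max_rounds` cutoff are all hypothetical, and each stage is passed in as a placeholder function rather than a real model.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class EditResult:
    """Hypothetical container for one accepted manipulation."""
    edited: Any                 # the manipulated image
    mask: Any                   # ground-truth mask of the manipulated region(s)
    rounds: int                 # number of plan/execute/validate rounds used
    feedback: List[str] = field(default_factory=list)  # rejection reasons seen


def closed_loop_edit(
    image: Any,
    perceive: Callable[[Any], Any],
    plan: Callable[[Any, List[str]], Any],
    execute: Callable[[Any, Any], Tuple[Any, Any]],
    validate: Callable[[Any, Any, Any], Tuple[bool, str]],
    max_rounds: int = 3,        # assumed cutoff; the source gives no number
) -> Optional[EditResult]:
    """Run perceive -> plan -> execute -> validate, retrying with feedback."""
    scene = perceive(image)     # scene perception runs once per source image
    feedback: List[str] = []
    for round_idx in range(1, max_rounds + 1):
        # Editing planning sees all earlier rejection reasons (reflection).
        edit_plan = plan(scene, feedback)
        # Manipulation execution returns the edited image and its mask.
        edited, mask = execute(image, edit_plan)
        # Quality validation either accepts or explains the rejection.
        accepted, reason = validate(image, edited, edit_plan)
        if accepted:
            return EditResult(edited, mask, round_idx, feedback)
        feedback.append(reason)
    return None                 # no candidate passed validation; discard it
```

In a real pipeline each callable would wrap a perception model, a planner, an AIGC editor, and a validator. The sketch only shows how rejection reasons could feed back into the next planning round.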

Problem

Research questions and friction points this paper is trying to address.

image manipulation detection

localization

AIGC

benchmark

realism

Innovation

Methods, ideas, or system contributions that make the work stand out.

AIGC manipulation localization

agent-curated benchmark

CraftAgent

PhaseAware-Net

semantic-forensic consistency
