CORE: Conflict-Oriented Reasoning for General Multimodal Manipulation Detection

📅 2026-06-01
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing multimodal misinformation detection methods rely heavily on models tailored to specific manipulation types and require extensive labeled data, which limits their generalization. This work proposes CORE (Conflict-Oriented Reasoning), a framework that reframes detection as the problem of identifying intrinsic conflicts. By constructing a fine-grained conflict attribution corpus, CORE strengthens the ability of multimodal large language models to explicitly perceive and reason about cross-modal and commonsense inconsistencies. The approach does not require large-scale annotations and adapts to unseen manipulation types, and it is reported to outperform state-of-the-art models in both zero-shot and few-shot settings.
📝 Abstract
The rapid rise of generative AI has made multimodal fake news increasingly realistic and pervasive, posing severe threats to public trust and social stability. Existing detection methods rely heavily on manipulation-specific models and large-scale labeled data, resulting in poor generalization to emerging manipulation types. We observe that the essence of manipulated misinformation lies in its intrinsic conflicts, i.e., semantic or physical inconsistencies either across modalities or with common world knowledge. Inspired by this observation, we propose the Conflict-Oriented REasoning (CORE) framework, an effective paradigm that endows multimodal large language models (MLLMs) with explicit conflict-capturing capability. To this end, CORE first constructs the Conflict Attribution Corpus (CAC) with fine-grained annotations of conflict factors and sources, providing essential data support for subsequent conflict perception training. By performing conflict-oriented representation enhancement and reasoning based on CAC, CORE achieves robust and generalizable conflict detection, adapting rapidly to unseen manipulation types with only a few samples or even in zero-shot settings. Extensive experiments demonstrate that CORE surpasses state-of-the-art models. The dataset and code are publicly available at https://github.com/shen8424/CORE.
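
To make the idea of "fine-grained annotations of conflict factors and sources" concrete, here is a minimal sketch of what one annotated sample could look like. This is an illustration only, not the actual CAC schema or the released code: the names `ConflictSource`, `ConflictAnnotation`, `CorpusRecord`, and all field names and example values are hypothetical. The only parts taken from the abstract are the two conflict sources (across modalities, or against common world knowledge) and the framing that a sample counts as manipulated when it contains an intrinsic conflict.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class ConflictSource(Enum):
    """Hypothetical labels for the two conflict sources named in the abstract."""
    CROSS_MODAL = "cross_modal"          # image and text disagree with each other
    WORLD_KNOWLEDGE = "world_knowledge"  # content contradicts common world knowledge


@dataclass
class ConflictAnnotation:
    """One annotated conflict: what is inconsistent (factor) and where it comes from (source)."""
    factor: str
    source: ConflictSource


@dataclass
class CorpusRecord:
    """Hypothetical image-text sample with zero or more conflict annotations."""
    image_path: str
    text: str
    conflicts: List[ConflictAnnotation] = field(default_factory=list)

    @property
    def is_manipulated(self) -> bool:
        # Assumption for this sketch: any annotated conflict marks the sample as manipulated.
        return len(self.conflicts) > 0

    def sources_present(self) -> Set[ConflictSource]:
        """Return the distinct conflict sources annotated on this sample."""
        return {c.source for c in self.conflicts}


# Made-up example values, for illustration only.
clean = CorpusRecord(image_path="img/0001.jpg", text="A dog runs on a beach.")
tampered = CorpusRecord(
    image_path="img/0002.jpg",
    text="Crowds celebrate in heavy snow.",
    conflicts=[
        ConflictAnnotation(
            factor="caption mentions snow, image shows a sunny street",
            source=ConflictSource.CROSS_MODAL,
        ),
    ],
)
```

In this sketch the binary real/manipulated label is derived from the conflict annotations instead of being stored separately, which mirrors the abstract's framing of detection as conflict identification. Whether the real corpus stores labels this way is not stated in the source.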
Problem

Research questions and friction points this paper is trying to address.

multimodal manipulation detection
fake news
generalization
conflict detection
generative AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conflict-Oriented Reasoning
Multimodal Manipulation Detection
Multimodal Large Language Models
Generalization
Zero-shot Adaptation