Diversity over Uniformity: Rethinking Representation in Generated Image Detection

📅 2026-02-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing methods for detecting generated images often exhibit limited generalization to unseen generative models due to their overreliance on a few salient forgery cues. To address this issue, this work proposes an anti-feature-collapse learning framework that preserves diverse and complementary discriminative evidence by filtering out task-irrelevant components, suppressing excessive overlap of forgery cues in the representation space, and incorporating a multi-view discrimination mechanism. This approach effectively prevents discriminative information from collapsing into dominant feature directions. Extensive experiments demonstrate that the proposed method significantly outperforms state-of-the-art techniques across multiple public benchmarks, achieving a 5.02% improvement in accuracy under cross-model detection scenarios, thereby exhibiting superior generalization capability and detection stability.

📝 Abstract
With the rapid advancement of generative models, generated image detection has become an important task in visual forensics. Although existing methods have achieved remarkable progress, after training they often rely on only a small subset of highly salient forgery cues, which limits their ability to generalize to unseen generative mechanisms. We argue that reliable generated image detection should not depend on a single decision path but should preserve multiple judgment perspectives, enabling the model to understand the differences between real and generated images from diverse viewpoints. Based on this idea, we propose an anti-feature-collapse learning framework that filters out task-irrelevant components and suppresses excessive overlap among different forgery cues in the representation space, preventing discriminative information from collapsing into a few dominant feature directions. This design maintains diverse and complementary evidence within the model, reduces reliance on a small set of salient cues, and enhances robustness under unseen generative settings. Extensive experiments on multiple public benchmarks demonstrate that the proposed method significantly outperforms state-of-the-art approaches in cross-model scenarios, achieving an accuracy improvement of 5.02% and exhibiting superior generalization and detection reliability. The source code is available at https://github.com/Yanmou-Hui/DoU.
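The paper's exact loss formulation is not given on this page, but the core idea of "suppressing excessive overlap among forgery cues in the representation space" can be illustrated with a generic off-diagonal decorrelation penalty, in the spirit of redundancy-reduction objectives. The function name and all details below are hypothetical, a minimal sketch rather than the authors' method:

```python
import numpy as np

def decorrelation_penalty(features: np.ndarray) -> float:
    """Off-diagonal penalty on the feature correlation matrix.

    A generic sketch of anti-feature-collapse regularization: penalize
    correlation between different feature dimensions so that discriminative
    evidence does not collapse into a few dominant directions.
    `features` is a (batch, dim) matrix of embeddings.
    """
    # Standardize each feature dimension across the batch.
    z = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)
    n = z.shape[0]
    corr = (z.T @ z) / n                      # (dim, dim) correlation matrix
    off_diag = corr - np.diag(np.diag(corr))  # zero out the diagonal
    # Zero when feature dimensions are fully decorrelated; large when
    # several dimensions encode the same (collapsed) cue.
    return float((off_diag ** 2).sum())
```

Added to the detection loss, such a term would push different feature dimensions toward encoding complementary forgery evidence: two perfectly duplicated dimensions incur the maximum penalty, while statistically independent dimensions incur almost none.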
Problem

Research questions and friction points this paper is trying to address.

generated image detection
generalization
forgery cues
representation diversity
visual forensics
Innovation

Methods, ideas, or system contributions that make the work stand out.

anti-feature-collapse
diverse representation
generated image detection
forgery cues
generalization
Qinghui He
Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
Haifeng Zhang
Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
Qiao Qin
Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
Bo Liu
Associate Professor, Chongqing University of Posts and Telecommunications
Information Security · Multimedia Forensics · Image Processing
Xiuli Bi
Professor of Computer Science, Chongqing University of Posts and Telecommunications
Image Processing · Pattern Recognition
Bin Xiao
Meta GenAI
Computer Vision · Vision and Language · Machine Learning · Human Pose Estimation