🤖 AI Summary
To address the weak generalization of existing AI-generated image detection methods, particularly their poor performance on emerging generative models, this paper proposes a universal detection framework built on the intrinsic noise footprints that generative models leave in their outputs. The method comprises three stages: noise footprint extraction, multi-model footprint simulation and aggregation, and multimodal feature fusion for detection. Key contributions include: (1) the first noise footprint simulator capable of emulating diverse model-specific noise patterns; (2) a cross-model noise footprint extrapolation mechanism that enables robust detection of unseen generative models; and (3) the first deep integration of noise patterns with visual features, establishing a novel multimodal detection paradigm. Evaluated on three major benchmarks (GenImage, Synthbuster, and Chameleon), the approach achieves state-of-the-art performance, significantly improving robustness and generalization against previously unseen generative models.
📝 Abstract
With the rapid advancement of vision generation models, the potential security risks stemming from synthetic visual content have garnered increasing attention, posing significant challenges for AI-generated image detection. Existing methods suffer from inadequate generalization, resulting in unsatisfactory performance on emerging generative models. To address this issue, this paper presents a novel framework that leverages noise-based, model-specific imprints for the detection task. Specifically, we propose a noise-based imprint simulator that captures the intrinsic patterns imprinted in images generated by different models. By aggregating imprints from various generative models, imprints of future models can be extrapolated to expand the training data, thereby enhancing generalization and robustness. Furthermore, we design a new pipeline that pioneers the use of noise patterns, derived from a noise-based imprint extractor, alongside other visual features for AI-generated image detection, yielding a significant improvement in performance. Our approach achieves state-of-the-art performance across three public benchmarks: GenImage, Synthbuster, and Chameleon.
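The fusion idea described above can be illustrated with a minimal sketch. This is not the paper's method: a generic high-pass residual (image minus a box-filtered copy) stands in for the learned noise-imprint extractor, a flattened image stands in for a visual encoder's output, and plain concatenation stands in for the multimodal fusion module; all names here are hypothetical.

```python
import numpy as np

def noise_residual(img, k=3):
    """Approximate a noise 'imprint' as a high-pass residual:
    the image minus a local-mean (k x k box-filtered) copy of itself.
    A generic stand-in for the paper's learned imprint extractor."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    smoothed = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            smoothed += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    smoothed /= k * k
    return img - smoothed

def fuse_features(noise_feat, visual_feat):
    """Late fusion by simple concatenation of the two feature vectors."""
    return np.concatenate([noise_feat, visual_feat])

rng = np.random.default_rng(0)
img = rng.random((8, 8))
noise_feat = noise_residual(img).flatten()   # noise branch
visual_feat = img.flatten()                  # placeholder visual branch
fused = fuse_features(noise_feat, visual_feat)
print(fused.shape)  # (128,)
```

A downstream classifier would then be trained on `fused`; the paper's extrapolation step would additionally augment training data with simulated imprints of unseen models.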