🤖 AI Summary
To address the growing challenges of detecting and attributing AI-generated content (AIGC), this paper proposes a training-free, scalable multimodal framework for AIGC identification, jointly tackling binary human/AI classification and fine-grained cross-model source attribution. Methodologically, it introduces a novel incremental adaptation mechanism that integrates perceptual hashing, cosine similarity matching, and self-supervised pseudo-labeling, enabling zero-shot integration of unseen generative models. By aligning features across text and image modalities, the framework achieves universal, dynamic, and robust detection without modality-specific heuristics. Evaluated on the Defactify4 benchmark, it achieves statistically significant accuracy improvements over state-of-the-art baselines in both text and image domains. The implementation is publicly available.
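The matching-plus-pseudo-labeling idea described above can be sketched roughly as follows. This is a hypothetical illustration, not the paper's actual implementation: the class `IncrementalAttributor`, the prototype (mean-feature) representation, and the similarity threshold are all assumptions introduced here to show how cosine-similarity matching can attribute a sample to a known generator while low-similarity samples seed a new pseudo-labeled class without retraining.

```python
import numpy as np

def cosine_sim(query, prototypes):
    # Cosine similarity between one query vector and a matrix of prototypes.
    q = query / np.linalg.norm(query)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return p @ q

class IncrementalAttributor:
    """Hypothetical sketch: each known generator is summarized by a
    prototype (mean feature vector); a query is attributed to the most
    similar prototype, and low-similarity queries spawn a new
    pseudo-labeled class instead of triggering retraining."""

    def __init__(self, threshold=0.8):
        self.prototypes = []        # one feature centroid per known source
        self.labels = []            # source names
        self.threshold = threshold  # assumed cut-off for "unseen model"

    def add_source(self, name, features):
        # Register a generative model from a few of its sample features.
        self.prototypes.append(np.mean(features, axis=0))
        self.labels.append(name)

    def attribute(self, feature):
        sims = cosine_sim(feature, np.vstack(self.prototypes))
        best = int(np.argmax(sims))
        if sims[best] < self.threshold:
            # Self-supervised pseudo-label: create a new class for the
            # presumed unseen generator -- zero-shot integration.
            new_name = f"unknown_{len(self.labels)}"
            self.add_source(new_name, feature[None, :])
            return new_name, float(sims[best])
        return self.labels[best], float(sims[best])
```

In this sketch the features could come from any fixed encoder (e.g. a perceptual hash or a pretrained embedding); because only prototypes are stored and compared, adding a new generator is a constant-time append rather than a retraining run.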
📝 Abstract
The rapid growth of generative AI technologies has heightened the importance of effectively distinguishing between human- and AI-generated content, as well as classifying outputs from diverse generative models. This paper presents a scalable framework that integrates perceptual hashing, similarity measurement, and pseudo-labeling to address these challenges. Our method enables the incorporation of new generative models without retraining, ensuring adaptability and robustness in dynamic scenarios. Comprehensive evaluations on the Defactify4 dataset demonstrate competitive performance in text and image classification tasks, achieving high accuracy both in distinguishing human- from AI-generated content and in classifying among generative methods. These results highlight the framework's potential for real-world applications as generative AI continues to evolve. Source code is publicly available at https://github.com/ffyyytt/defactify4.