🤖 AI Summary
Current methods for detecting artifacts in synthetic images are hindered by the high cost of pixel-level manual annotation and the limited scale of training data. To address this, we propose a fully automated, annotation-free synthetic data generation framework: a controllable artifact-contamination pipeline that injects diverse, representative artifacts into predefined image regions while simultaneously producing precise, noise-free pixel-level ground-truth labels. Our method incorporates world knowledge to guide realistic artifact modeling, enabling scalable, high-fidelity construction of training datasets. Using this synthetically generated dataset, we train artifact detectors with ConvNeXt and Swin-T backbones, achieving mAP improvements of 13.2% and 3.7%, respectively, on a manually annotated real-world test set, substantially improving the assessment of generative image quality. Our work establishes a novel, scalable, and cost-effective paradigm for AI-generated content forensics.
📝 Abstract
Artifact detectors have been shown to enhance the performance of image-generative models by serving as reward models during fine-tuning. These detectors enable the generative model to improve overall output fidelity and aesthetics. However, training an artifact detector requires expensive pixel-level human annotations that specify the artifact regions, and the lack of annotated data limits detector performance. A naive pseudo-labeling approach (training a weak detector and using it to annotate unlabeled images) suffers from noisy labels, resulting in poor performance. To address this, we propose an artifact corruption pipeline that automatically injects artifacts into predetermined regions of clean, high-quality synthetic images, thereby producing pixel-level annotations without manual labeling. The proposed method enables training of an artifact detector that achieves performance improvements of 13.2% for ConvNeXt and 3.7% for Swin-T over baseline approaches, as verified on human-labeled data. This work represents an initial step toward scalable pixel-level artifact annotation datasets that integrate world knowledge into artifact detection.
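The core idea of the pipeline (inject a controlled corruption into a known region, and keep the region itself as the ground-truth mask) can be sketched minimally. The function name, the choice of Gaussian noise as the corruption, and all parameters below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def inject_artifact(image, rng=None, patch_size=32, noise_scale=0.5):
    """Corrupt a random square region of `image` (hypothetical example
    corruption: additive Gaussian noise) and return the corrupted image
    together with a binary mask.

    Because the injection region is chosen by the pipeline, the mask is an
    exact, noise-free pixel-level label: 1 where the artifact was injected,
    0 elsewhere, with no manual annotation required.
    """
    rng = rng or np.random.default_rng(0)
    h, w = image.shape[:2]
    y = int(rng.integers(0, h - patch_size))
    x = int(rng.integers(0, w - patch_size))

    corrupted = image.astype(np.float32).copy()
    noise = rng.normal(0.0, noise_scale,
                       size=(patch_size, patch_size, image.shape[2]))
    corrupted[y:y + patch_size, x:x + patch_size] += noise
    corrupted = np.clip(corrupted, 0.0, 1.0)

    mask = np.zeros((h, w), dtype=np.uint8)
    mask[y:y + patch_size, x:x + patch_size] = 1
    return corrupted, mask

# Usage: a 128x128 RGB "clean" image with values in [0, 1]
clean = np.full((128, 128, 3), 0.5, dtype=np.float32)
img, mask = inject_artifact(clean)
print(mask.sum())  # labeled artifact pixels: 32 * 32 = 1024
```

A real pipeline would replace the toy noise patch with diverse, representative artifact types (and, per the abstract, world-knowledge-guided placement), but the annotation-free labeling mechanism is the same: the mask falls out of the injection step for free.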