🤖 AI Summary
This work addresses two weaknesses of existing AI-generated image detectors: a prediction bias toward the real class, and insufficient robustness under common post-processing operations such as compression and resizing. To mitigate these issues, the authors propose DEAR (Dissect and Prune), a method that, for the first time, leverages alignment analysis between image inpainting masks and channel activations to identify and prune extreme-response channels that rely on spurious cues, thereby preserving robust features sensitive to generative artifacts. By integrating feature pruning with adversarial post-processing during training, DEAR alleviates the asymmetry in real-versus-fake prediction behavior. Experimental results show that DEAR substantially improves detector generalization to unseen generative models and robustness under diverse post-processing conditions.
📝 Abstract
While existing AI-generated image detectors report high performance, we identify that this is largely driven by a critical prediction asymmetry: a bias toward the real class that severely limits sensitivity to generated content, especially under standard post-processing operations such as compression and resizing. We hypothesize that this stems from the model's reliance on spurious features, that is, distracting signals that obscure true generative artifacts. To address this, we propose DEAR (Dissect and Prune), which leverages inpainted images to identify and prune these interfering components. Specifically, we find that features strongly aligned with either inpainted or non-inpainted regions are less robust to post-processing. By measuring the alignment between channel activations and inpaint masks, DEAR removes features at both extremes, retaining only those that capture genuine generative artifacts. Experimental results demonstrate that our approach significantly enhances robustness against unseen generators and post-processing, effectively mitigating the prediction asymmetry. Our code is available at https://github.com/dahyedahye/dear.
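The pruning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that alignment is measured as the per-channel Pearson correlation between a spatial activation map and the inpaint mask (resized to the feature resolution), and that a fixed fraction of channels is dropped at each extreme. The function names `channel_mask_alignment` and `keep_mask_from_alignment`, and the `prune_frac` parameter, are hypothetical; the linked repository may define alignment and the pruning threshold differently.

```python
import numpy as np


def channel_mask_alignment(activations, masks):
    """Mean per-channel correlation between activations and inpaint masks.

    ASSUMPTION: Pearson correlation is used as the alignment score; the
    paper may use a different measure.

    activations: array of shape (N, C, H, W), feature maps for N images.
    masks: array of shape (N, H, W), 1 where a pixel was inpainted, else 0,
        already resized to the feature-map resolution.
    Returns an array of shape (C,) with values in [-1, 1].
    """
    n, c, h, w = activations.shape
    a = activations.reshape(n, c, h * w).astype(np.float64)
    m = masks.reshape(n, 1, h * w).astype(np.float64)
    # Center each map over its spatial positions.
    a = a - a.mean(axis=2, keepdims=True)
    m = m - m.mean(axis=2, keepdims=True)
    num = (a * m).sum(axis=2)  # shape (N, C)
    # Small epsilon guards against constant maps (zero variance).
    den = np.sqrt((a ** 2).sum(axis=2) * (m ** 2).sum(axis=2)) + 1e-12
    return (num / den).mean(axis=0)  # average over images


def keep_mask_from_alignment(scores, prune_frac=0.1):
    """Boolean keep-mask that drops channels at both alignment extremes.

    ASSUMPTION: a symmetric fraction `prune_frac` is removed from each end.
    The lowest scores correspond to channels aligned with non-inpainted
    regions, the highest to channels aligned with inpainted regions.
    """
    if not 0.0 <= prune_frac < 0.5:
        raise ValueError("prune_frac must be in [0, 0.5)")
    c = scores.shape[0]
    k = int(round(prune_frac * c))
    order = np.argsort(scores)
    keep = np.ones(c, dtype=bool)
    if k > 0:
        keep[order[:k]] = False   # most aligned with non-inpainted regions
        keep[order[-k:]] = False  # most aligned with inpainted regions
    return keep
```

In use, the resulting keep-mask would be applied along the channel axis, for example `features[:, keep]`, before the features reach the classifier head. Which layer to dissect and how many channels to drop are design choices this sketch leaves open.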