🤖 AI Summary
To address the limited robustness of deep vision models under common image corruptions, this paper proposes a training data augmentation pipeline that combines neural style transfer with synthetic image data. We first observe that stylizing synthetic images, although it worsens their Fréchet Inception Distance (FID), improves corruption robustness when the images are used for training. We further show that style transfer and synthetic data complement each other, and we empirically identify which rule-based augmentations the combination is compatible with: it works with TrivialAugment but not with some other methods. Through systematic hyperparameter analysis and evaluation on several benchmarks, our method achieves state-of-the-art robust accuracy on the small-scale corruption benchmarks CIFAR-10-C (93.54%), CIFAR-100-C (74.90%), and TinyImageNet-C (50.86%).
📝 Abstract
This paper proposes a training data augmentation pipeline that combines synthetic image data with neural style transfer in order to address the vulnerability of deep vision models to common corruptions. We show that although applying style transfer to synthetic images degrades their quality as measured by the common FID metric, these images are surprisingly beneficial for model training. We conduct a systematic empirical analysis of the effects of both augmentations and their key hyperparameters on the performance of image classifiers. Our results demonstrate that stylization and synthetic data complement each other well and can be combined with popular rule-based data augmentation techniques such as TrivialAugment, but not with others. Our method achieves state-of-the-art corruption robustness on several small-scale image classification benchmarks, reaching 93.54%, 74.90% and 50.86% robust accuracy on CIFAR-10-C, CIFAR-100-C and TinyImageNet-C, respectively.
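To make the composition described in the abstract concrete, the following is a minimal sketch of how the three ingredients (synthetic data, stylization, and a rule-based augmentation such as TrivialAugment) might be chained when drawing a training sample. It is not the authors' implementation. The names `make_sampler`, `synthetic_ratio`, and `style_prob` are hypothetical, the `stylize` and `rule_based_aug` callables are placeholders for a real style transfer model and a real augmentation policy, and the ordering (stylize first, then rule-based augmentation) is an assumption that the abstract does not specify.

```python
import random


def make_sampler(real_pool, synthetic_pool, stylize, rule_based_aug,
                 synthetic_ratio=0.5, style_prob=0.5, rng=None):
    """Return a function that draws one augmented training image.

    All parameter names and default values are illustrative assumptions,
    not settings taken from the paper.
    """
    rng = rng or random.Random()

    def sample():
        # Choose between the real and the synthetic image pool.
        use_synthetic = rng.random() < synthetic_ratio
        pool = synthetic_pool if use_synthetic else real_pool
        image = rng.choice(pool)

        # Optionally stylize the image (placeholder for neural style transfer).
        stylized = rng.random() < style_prob
        if stylized:
            image = stylize(image)

        # Assumed ordering: the rule-based augmentation is applied last.
        image = rule_based_aug(image)
        return image, use_synthetic, stylized

    return sample
```

In a real setup, `stylize` would wrap a pretrained style transfer network and `rule_based_aug` would wrap an augmentation policy such as TrivialAugment. The two probabilities correspond loosely to the kind of hyperparameters the abstract says are analyzed, but their exact definition in the paper is not given here.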