Trans-defense: Transformer-based Denoiser for Adversarial Defense with Spatial-Frequency Domain Representation

📅 2025-10-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Deep neural networks are vulnerable to adversarial attacks, which limits their deployment in safety-critical applications. To address this, we propose a two-phase defense that combines spatial- and frequency-domain representations. A discrete wavelet transform (DWT) decomposes input images into multi-scale frequency components, exposing the high-frequency bands where adversarial perturbations do the most damage. A Transformer-based, frequency-aware denoising network then fuses these wavelet features with spatial image features to suppress adversarial noise while preserving discriminative features. The denoiser is trained first, and the classifier is then retrained on the denoised images. The approach pairs DWT's interpretable frequency decomposition with the Transformer's long-range dependency modeling. Experiments on MNIST, CIFAR-10, and Fashion-MNIST report improved robust accuracy under strong adversarial attacks, including PGD, AutoAttack, and Square, outperforming denoising-based and adversarial training baselines.

📝 Abstract

Deep neural networks (DNNs) have been successfully adopted in a wide range of applications. Despite these achievements, DNNs are vulnerable to sophisticated adversarial attacks, which restricts their use in security-critical systems. In this paper, we present a two-phase training method to counter such attacks: the first phase trains a denoising network, and the second trains the deep classifier. We propose a novel denoising strategy that integrates spatial- and frequency-domain approaches to defend against adversarial attacks on images. Our analysis reveals that the high-frequency components of attacked images are more severely corrupted than the low-frequency components. To address this, we use the Discrete Wavelet Transform (DWT) for frequency analysis and develop a denoising network that combines spatial image features with wavelet features through a transformer layer. We then retrain the classifier on the denoised images, which improves its robustness against adversarial attacks. Experimental results on the MNIST, CIFAR-10, and Fashion-MNIST datasets show that the proposed method substantially improves classification accuracy, outperforming denoising-network and adversarial training approaches. The code is available at https://github.com/Mayank94/Trans-Defense.
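The frequency analysis described in the abstract can be illustrated with a minimal sketch. It assumes a one-level Haar wavelet (the abstract does not name the wavelet family) and a toy checkerboard perturbation standing in for an adversarial one. The names `haar_dwt2`, `energy`, and `band_energy_shift` are hypothetical helpers, not taken from the paper's code.

```python
def haar_dwt2(img):
    """One-level orthonormal 2D Haar DWT of a 2D list (even height and width).

    Assumption: Haar is used only for illustration. Sub-band naming
    conventions (LH vs. HL) vary between libraries.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    bands = {"LL": [], "LH": [], "HL": [], "HH": []}
    for i in range(0, h, 2):
        rows = {name: [] for name in bands}
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rows["LL"].append((a + b + c + d) / 2)  # average of the 2x2 block
            rows["LH"].append((a + b - c - d) / 2)  # top row minus bottom row
            rows["HL"].append((a - b + c - d) / 2)  # left column minus right column
            rows["HH"].append((a - b - c + d) / 2)  # diagonal detail
        for name in bands:
            bands[name].append(rows[name])
    return bands


def energy(band):
    """Sum of squared values in a 2D list."""
    return sum(v * v for row in band for v in row)


def band_energy_shift(clean, attacked):
    """Energy of the coefficient change in each sub-band caused by the perturbation."""
    cb, ab = haar_dwt2(clean), haar_dwt2(attacked)
    shift = {}
    for name in cb:
        diff = [[x - y for x, y in zip(ra, rc)] for ra, rc in zip(ab[name], cb[name])]
        shift[name] = energy(diff)
    return shift


# Toy data: a smooth gradient image plus a +/- eps checkerboard perturbation.
size, eps = 8, 0.03
clean = [[(i + j) / (2 * (size - 1)) for j in range(size)] for i in range(size)]
attacked = [[clean[i][j] + eps * (-1) ** (i + j) for j in range(size)] for i in range(size)]

for name, value in band_energy_shift(clean, attacked).items():
    print(name, round(value, 6))
```

In this toy case essentially all of the perturbation energy lands in the HH detail band, while the LL approximation band is left almost unchanged. Real adversarial perturbations are not pure checkerboards, so the split is less extreme in practice; the sketch only shows why inspecting wavelet sub-bands separately can expose high-frequency corruption.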

Problem

Research questions and friction points this paper is trying to address.

Defending deep neural networks against adversarial attacks on images

Integrating spatial and frequency domains for image denoising

Enhancing classifier robustness using denoised adversarial examples

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based denoiser with spatial-frequency domain representation

Two-phase training combining denoising network and classifier retraining

Discrete Wavelet Transform analyzes frequency components for defense
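
The order of the two training phases can be sketched with deliberately tiny stand-ins. Everything below is an assumption made for illustration: inputs are single numbers rather than images, the "denoiser" learns one constant offset instead of being a Transformer network, the "classifier" is a threshold between class means, and the "attack" is a constant shift. Only the control flow (fit the denoiser first, then retrain the classifier on denoised attacked inputs) follows the description above.

```python
from statistics import mean


def fit_offset_denoiser(attacked, clean):
    """Toy stand-in for the denoising network: learn one constant offset."""
    offset = mean(a - c for a, c in zip(attacked, clean))
    return lambda x: x - offset


def fit_threshold_classifier(inputs, labels):
    """Toy stand-in for the deep classifier (assumes class 1 has the larger mean)."""
    mean0 = mean(x for x, y in zip(inputs, labels) if y == 0)
    mean1 = mean(x for x, y in zip(inputs, labels) if y == 1)
    threshold = (mean0 + mean1) / 2
    return lambda x: int(x > threshold)


def two_phase_training(clean, attacked, labels):
    # Phase 1: train the denoiser to map attacked inputs back to clean ones.
    denoise = fit_offset_denoiser(attacked, clean)
    # Phase 2: retrain the classifier on the denoised attacked inputs.
    denoised = [denoise(x) for x in attacked]
    classify = fit_threshold_classifier(denoised, labels)
    # At inference time, every input passes through the denoiser first.
    return lambda x: classify(denoise(x))


clean = [0.0, 0.1, 0.9, 1.0]
labels = [0, 0, 1, 1]
attacked = [x + 0.6 for x in clean]  # toy "attack": a constant shift

undefended = fit_threshold_classifier(clean, labels)
defended = two_phase_training(clean, attacked, labels)

print([undefended(x) for x in attacked])  # [1, 1, 1, 1]
print([defended(x) for x in attacked])    # [0, 0, 1, 1]
```

The undefended classifier mislabels both class-0 samples once they are shifted, while the denoise-then-classify pipeline recovers the original labels. The sketch says nothing about how well the actual method performs; it only makes the data flow between the two phases concrete.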

🔎 Similar Papers

DiffuseDef: Improved Robustness to Adversarial Attacks