Adversarial flows: A gradient flow characterization of adversarial attacks

📅 2024-06-08
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
This paper establishes a unified theoretical framework for adversarial attacks from the perspective of gradient flows. To analyze FGSM and its iterative variants, it develops a theory of curves of maximal slope under the ∞-norm, models these attacks as explicit Euler discretizations of the corresponding gradient flow, and rigorously proves convergence of the discretizations. It further unifies normalized gradient descent–based attack algorithms via a continuous-limit analysis and embeds the inner-loop optimization of adversarial training into the Wasserstein gradient flow framework. The theoretical contributions are: (i) existence of ∞-curves of maximal slope and their equivalent characterization via differential inclusions; (ii) convergence, up to subsequences, of the discrete attack algorithms to their continuous gradient flow limits as the step size tends to zero; and (iii) a foundation for adversarial robustness grounded in optimal transport theory and variational principles.
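
To make the Euler-discretization viewpoint concrete, here is a minimal sketch in Python (not the authors' code; the toy quadratic energy and all function names are our own illustrative assumptions). Iterated FGSM is written as explicit Euler steps along the componentwise sign of the gradient; shrinking the step size while keeping the time horizon step * n_steps fixed approximates the continuous-time flow the paper identifies as an ∞-curve of maximal slope.

```python
import numpy as np

def grad_E(x, A, b):
    """Gradient of the toy energy E(x) = 0.5 * x^T A x - b^T x (illustrative choice)."""
    return A @ x - b

def ifgsm(x0, A, b, step, n_steps):
    """Iterated fast gradient sign method: each iteration is one explicit
    Euler step of size `step` for the inclusion x'(t) in sign(grad E(x(t)))."""
    x = x0.copy()
    for _ in range(n_steps):
        x = x + step * np.sign(grad_E(x, A, b))
    return x

# Refining the discretization: halving the step while doubling the number
# of steps keeps the time horizon step * n_steps = 1 fixed, so the iterates
# trace (approximately) the same continuous-time curve.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
x0 = np.zeros(2)
print(ifgsm(x0, A, b, step=0.1, n_steps=10))
print(ifgsm(x0, A, b, step=0.01, n_steps=100))
```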

📝 Abstract
A popular method to perform adversarial attacks on neural networks is the so-called fast gradient sign method and its iterative variant. In this paper, we interpret this method as an explicit Euler discretization of a differential inclusion, and we show convergence of the discretization to the associated gradient flow. To do so, we consider the concept of p-curves of maximal slope in the case $p=\infty$. We prove existence of $\infty$-curves of maximal slope and derive an alternative characterization via differential inclusions. Furthermore, we also consider Wasserstein gradient flows for potential energies, where we show that curves in the Wasserstein space can be characterized by a representing measure on the space of curves in the underlying Banach space which fulfill the differential inclusion. The application of our theory to the finite-dimensional setting is twofold: on the one hand, we show that a whole class of normalized gradient descent methods (in particular, signed gradient descent) converges, up to subsequences, to the flow when the step size is sent to zero. On the other hand, in the distributional setting, we show that the inner optimization task of the adversarial training objective can be characterized via $\infty$-curves of maximal slope on an appropriate optimal transport space.
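
As a hedged pointer to the kind of object involved (our notation for the smooth, finite-dimensional case; not the paper's exact statement), the continuous-time limit of the sign method is a differential inclusion driven by the set-valued sign of the gradient:

```latex
% Ascent version, for an attack increasing the energy E:
\[
  x'(t) \in \operatorname{Sign}\big(\nabla E(x(t))\big),
  \qquad
  \operatorname{Sign}(s)_i =
  \begin{cases}
    \{+1\}, & s_i > 0,\\
    [-1,1], & s_i = 0,\\
    \{-1\}, & s_i < 0.
  \end{cases}
\]
% Equivalently, x'(t) is a maximizer of \langle v, \nabla E(x(t)) \rangle
% over the unit ball \{\|v\|_\infty \le 1\}, which is where the name
% "\infty-curves of maximal slope" comes from.
```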
Problem

Research questions and friction points this paper is trying to address.

Characterizing adversarial attacks via gradient flow differential inclusions
Proving existence and convergence of ∞-curves of maximal slope
Linking the inner loop of adversarial training to curves in an optimal transport space (see the sketch after this list)
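
For context on the last point, a hedged sketch of the standard adversarial training objective (standard notation, not copied from the paper):

```latex
\[
  \min_{\theta}\;
  \mathbb{E}_{(x,y)\sim\mu}
  \Big[ \max_{\|\delta\|_\infty \le \varepsilon}
        \ell\big(f_\theta(x+\delta),\, y\big) \Big]
\]
% The inner maximization over perturbations \delta is the part the paper
% characterizes via \infty-curves of maximal slope on an appropriate
% optimal transport space over the data distribution \mu.
```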
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interprets adversarial attacks as differential inclusions
Proves existence of ∞-curves of maximal slope
Characterizes Wasserstein gradient flows via representing measures on curve spaces
👥 Authors
Lukas Weigand
Helmholtz Imaging, Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, 22607 Hamburg, Germany
Tim Roith
Postdoc, Deutsches Elektronen-Synchrotron DESY (Mathematics)
Martin Burger
Deutsches Elektronen-Synchrotron DESY and Universität Hamburg (Mathematics, Imaging)