🤖 AI Summary
This work studies optimization over probability distributions in the 2-Wasserstein space, focusing on guarantees that the empirical measure of a finite particle system converges to the optimal distribution. We propose the Virtual Particle Stochastic Approximation (VP-SA) algorithm, which extends the virtual particle paradigm, originally introduced for Stein variational gradient descent, to mean-field optimization. Its output particles are i.i.d., and their empirical measure converges to the optimal distribution without an explicit propagation-of-chaos analysis. VP-SA can be viewed as stochastic gradient descent in the Wasserstein space: it draws on Wasserstein gradient flows, Stein variational gradient descent, and stochastic optimization theory, and achieves convergence at the standard SGD rate. Compared to prior finite-particle analyses, the assumptions are significantly weaker: in popular settings, the convergence conditions are similar to those for the infinite-particle limit. This yields a more concise and practically applicable theoretical and algorithmic foundation for distributional optimization.
📝 Abstract
Gradient flow in the 2-Wasserstein space is widely used to optimize functionals over probability distributions and is typically implemented using an interacting particle system with $n$ particles. Analyzing these algorithms requires showing (a) that the finite-particle system converges and/or (b) that the resultant empirical distribution of the particles closely approximates the optimal distribution (i.e., propagation of chaos). However, establishing efficient sufficient conditions can be challenging, as the finite-particle system may produce heavily dependent random variables. In this work, we study the virtual particle stochastic approximation, originally introduced for Stein Variational Gradient Descent. This method can be viewed as a form of stochastic gradient descent in the Wasserstein space and can be implemented efficiently. In popular settings, we demonstrate that our algorithm's output converges to the optimal distribution under conditions similar to those for the infinite-particle limit, and that it produces i.i.d. samples without the need to explicitly establish propagation of chaos bounds.
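To make the "interacting particle system with $n$ particles" concrete, the sketch below implements one step of the standard finite-particle Stein Variational Gradient Descent update in NumPy. It is not the virtual particle method studied in this work; it only illustrates the baseline whose particles become dependent, because every particle's update uses all the others. The function name `svgd_step`, the RBF kernel, the fixed `bandwidth`, and the Gaussian target in the usage lines are all assumptions made for illustration.

```python
import numpy as np


def svgd_step(x, grad_log_p, step=0.1, bandwidth=1.0):
    """One standard SVGD update for particles x of shape (n, d).

    grad_log_p maps an (n, d) array to the (n, d) array of scores
    (gradients of the log target density) at each particle.
    """
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]           # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)           # (n, n) squared distances
    k = np.exp(-sq_dist / (2.0 * bandwidth ** 2))  # symmetric RBF kernel matrix

    # Driving term: sum_j k(x_i, x_j) * score(x_j), pulls particles to high density.
    drive = k @ grad_log_p(x)
    # Repulsive term: sum_j grad_{x_j} k(x_j, x_i) = sum_j k_ij (x_i - x_j) / h^2.
    repulse = np.sum(k[:, :, None] * diff, axis=1) / bandwidth ** 2

    # Every particle's update depends on all n particles, hence the coupling.
    return x + step * (drive + repulse) / n


# Usage: hypothetical standard Gaussian target, whose score is -x.
rng = np.random.default_rng(0)
particles = rng.normal(loc=5.0, scale=1.0, size=(50, 1))
for _ in range(500):
    particles = svgd_step(particles, lambda x: -x)
```

Because the kernel couples all particles at every step, the final `particles` are not independent draws, which is why an analysis of this baseline needs a propagation-of-chaos argument to relate the empirical measure to the target.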