🤖 AI Summary
To address numerical instability and slow convergence of Langevin dynamics-based MCMC in regions of rapidly varying gradients, this paper proposes an adaptive-step-size Langevin sampling algorithm. The method extends the phase space with a time-reparameterization variable and incorporates Adam's gradient-history adaptation mechanism (specifically, a moving average of gradient magnitudes) to estimate local steepness on the fly. This enables physically consistent, drift-preserving, time-varying step-size control without modifying the underlying stochastic differential equation. The algorithm is fully compatible with standard fixed-stepsize Langevin integrators and requires no architectural changes, making it effectively plug-and-play. Experiments on Neal's funnel distribution and Bayesian neural networks trained on MNIST demonstrate substantial improvements in numerical stability and sampling accuracy, accelerated convergence, and robust adaptivity across both steep and flat regions of the posterior landscape.
📝 Abstract
We present a framework for adaptive-stepsize MCMC sampling based on time-rescaled Langevin dynamics, in which the stepsize variation is dynamically driven by an additional degree of freedom. Our approach augments the phase space with an additional variable which in turn defines a time reparameterization. The use of an auxiliary relaxation equation allows accumulation of a moving average of a local monitor function and provides precise control of the timestep while circumventing the need to modify the drift term in the physical system. Our algorithm is straightforward to implement and can be readily combined with any off-the-peg fixed-stepsize Langevin integrator. As a particular example, we consider control of the stepsize by monitoring the norm of the log-posterior gradient, which takes inspiration from the Adam optimizer: the stepsize is automatically reduced in regions of steep change of the log posterior and increased on plateaus, improving numerical stability and convergence speed. As in Adam, the stepsize variation depends on the recent history of the gradient norm, which enhances stability and improves accuracy compared to more immediate control approaches. We demonstrate the potential benefit of this method, both in accuracy and in stability, in numerical experiments including Neal's funnel and a Bayesian neural network for classification of MNIST data.
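To make the mechanism concrete, the core idea can be sketched as follows. This is a minimal illustration, not the paper's actual integrator: it uses a naive Euler-Maruyama discretization of overdamped Langevin dynamics and omits the time-reparameterization correction that the paper introduces to preserve the invariant distribution; the function names, the `beta` decay rate, and the `1 + g_avg` denominator are illustrative choices, not taken from the paper.

```python
import numpy as np

def adaptive_langevin_sketch(grad_log_post, x0, n_steps=5000,
                             base_dt=1e-2, beta=0.9, seed=0):
    """Toy overdamped Langevin sampler whose stepsize is controlled by an
    exponential moving average of the log-posterior gradient norm, in the
    spirit of Adam: small steps where the gradient is large (steep regions),
    larger steps on plateaus. Hypothetical sketch, not the paper's scheme."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    g_avg = 0.0  # relaxation-style moving average of the monitor function
    samples = []
    for _ in range(n_steps):
        g = grad_log_post(x)
        # Accumulate the monitor (here: gradient norm) via an EMA.
        g_avg = beta * g_avg + (1.0 - beta) * np.linalg.norm(g)
        # Stepsize shrinks as the averaged gradient norm grows;
        # bounded above by base_dt.
        dt = base_dt / (1.0 + g_avg)
        # Euler-Maruyama step of dx = grad log p(x) dt + sqrt(2) dW.
        x = x + dt * g + np.sqrt(2.0 * dt) * rng.standard_normal(x.shape)
        samples.append(x.copy())
    return np.array(samples)

# Example: sample a 2D standard Gaussian, for which grad log p(x) = -x.
draws = adaptive_langevin_sketch(lambda x: -x, x0=np.zeros(2))
```

Note that naively rescaling the timestep in this way biases the invariant measure; the point of the paper's time-rescaling construction is to achieve the same adaptive behaviour without that bias and without altering the drift term.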