Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients

📅 2023-11-08
🏛️ arXiv.org
📈 Citations: 11
Influential: 2
📄 PDF
🤖 AI Summary
This work addresses the fundamental trade-off between bias and computational efficiency in Bayesian posterior mean estimation. We propose the first unbiased estimator based on kinetic Langevin dynamics that has finite variance and satisfies a central limit theorem. Our method integrates high-order splitting integrators, multilevel Monte Carlo, and hierarchical chain coupling, and accommodates inexact (stochastic or approximate) gradients without requiring Metropolis–Hastings correction or thermalization. Theoretically, we establish an optimal gradient complexity of $O(d^{1/4}\varepsilon^{-2})$, dimension-free variance under product distributions, and computational cost independent of dataset size. Empirically, on MNIST multiclass classification and Poisson regression for football score prediction, the estimator achieves a constant number of gradient evaluations per effective sample, outperforming randomized Hamiltonian Monte Carlo by two to three orders of magnitude in speed.
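The "high-order splitting integrators" mentioned above refer to discretizations of kinetic (underdamped) Langevin dynamics. As a rough illustration only (not the paper's exact scheme), a standard BAOAB-type splitting step for a target with potential $U$ can be sketched as follows; all names here are our own:

```python
import math
import random

def baoab_step(x, v, grad, h, gamma=1.0, rng=random):
    """One BAOAB splitting step of kinetic Langevin dynamics:
    B (half kick), A (half drift), O (exact OU solve), A, B."""
    v -= 0.5 * h * grad(x)                   # B: half momentum kick
    x += 0.5 * h * v                         # A: half position drift
    c = math.exp(-gamma * h)                 # O: exact Ornstein-Uhlenbeck update
    v = c * v + math.sqrt(1.0 - c * c) * rng.gauss(0.0, 1.0)
    x += 0.5 * h * v                         # A: half position drift
    v -= 0.5 * h * grad(x)                   # B: half momentum kick
    return x, v

# Sanity check on a standard Gaussian target: U(x) = x^2/2, grad U(x) = x.
rng = random.Random(0)
grad = lambda x: x
x, v = 0.0, 0.0
samples = []
for i in range(20000):
    x, v = baoab_step(x, v, grad, h=0.2, rng=rng)
    if i >= 1000:  # discard a short transient
        samples.append(x)
mean = sum(samples) / len(samples)
var = sum(s * s for s in samples) / len(samples)
```

For a Gaussian target, the empirical mean should be near 0 and the variance near 1, up to the small $O(h^2)$ discretization error that the paper's multilevel construction then removes entirely.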
📝 Abstract
We present an unbiased method for Bayesian posterior means based on kinetic Langevin dynamics that combines advanced splitting methods with enhanced gradient approximations. Our approach avoids Metropolis correction by coupling Markov chains at different discretization levels in a multilevel Monte Carlo approach. Theoretical analysis demonstrates that our proposed estimator is unbiased, attains finite variance, and satisfies a central limit theorem. It can achieve accuracy $\epsilon>0$ for estimating expectations of Lipschitz functions in $d$ dimensions with $\mathcal{O}(d^{1/4}\epsilon^{-2})$ expected gradient evaluations, without assuming warm start. We exhibit similar bounds using both approximate and stochastic gradients, and our method's computational cost is shown to scale independently of the size of the dataset. The proposed method is tested using a multinomial regression problem on the MNIST dataset and a Poisson regression model for soccer scores. Experiments indicate that the number of gradient evaluations per effective sample is independent of dimension, even when using inexact gradients. For product distributions, we give dimension-independent variance bounds. Our results demonstrate that in large-scale applications, the unbiased algorithm we present can be 2-3 orders of magnitude more efficient than the "gold-standard" randomized Hamiltonian Monte Carlo.
Problem

Research questions and friction points this paper is trying to address.

How to estimate Bayesian posterior means without the discretization bias of unadjusted Langevin samplers.
Metropolis–Hastings correction removes bias but adds per-step cost and complicates the use of stochastic gradients.
The cost of standard MCMC grows with dataset size, limiting large-scale Bayesian inference.
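The dataset-size independence hinges on replacing the full gradient with an unbiased stochastic estimate. A minimal sketch of the standard minibatch estimator of a log-posterior potential gradient (our own illustrative names, not the paper's code):

```python
import random

def stochastic_grad(theta, data, grad_prior, grad_lik, batch_size, rng):
    """Unbiased minibatch estimate of the full potential gradient:
    E[estimate] = grad_prior(theta) + sum_i grad_lik(theta, x_i)."""
    batch = rng.sample(data, batch_size)
    scale = len(data) / batch_size  # reweight so the subsampled sum is unbiased
    return grad_prior(theta) + scale * sum(grad_lik(theta, x) for x in batch)

# Tiny check: with batch_size == len(data) it reduces to the exact gradient.
data = [1.0, 2.0, 3.0]
g = stochastic_grad(1.0, data,
                    grad_prior=lambda t: t,        # e.g. Gaussian prior term
                    grad_lik=lambda t, x: t - x,   # e.g. Gaussian likelihood term
                    batch_size=3, rng=random.Random(0))
```

With these toy Gaussian terms the full gradient at `theta = 1.0` is `1 + (0 - 1 - 2) = -2`. Per the paper, such inexact gradients can be fed to the sampler directly, with no Metropolis accept/reject step needed.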
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unbiased kinetic Langevin Monte Carlo with inexact gradients
Multilevel Monte Carlo coupling without Metropolis correction
Dimension-independent computational cost scaling for large datasets
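The multilevel coupling idea above can be illustrated (in a simplified form that is our own construction, not the paper's integrator) by driving a fine chain (step $h/2$) and a coarse chain (step $h$) with shared Gaussian noise, so that their difference, the multilevel increment, shrinks as the step size is refined:

```python
import math
import random

def baoab(x, v, grad, h, gamma, xi):
    """One BAOAB kinetic Langevin step driven by a supplied Gaussian noise xi."""
    v -= 0.5 * h * grad(x)                      # B: half kick
    x += 0.5 * h * v                            # A: half drift
    c = math.exp(-gamma * h)
    v = c * v + math.sqrt(1.0 - c * c) * xi     # O: exact OU update
    x += 0.5 * h * v                            # A: half drift
    v -= 0.5 * h * grad(x)                      # B: half kick
    return x, v

def coupled_increment(grad, h, gamma, n_steps, rng):
    """Level difference f(x_fine) - f(x_coarse) for f(x) = x, where the coarse
    chain's noise is assembled from the fine chain's two half-step noises."""
    xf = vf = xc = vc = 0.0
    ch = math.exp(-gamma * h / 2.0)
    sh = math.sqrt(1.0 - ch * ch)
    s = math.sqrt(1.0 - ch ** 4)  # coarse O-step noise scale (c_coarse = ch^2)
    for _ in range(n_steps):
        xi1, xi2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xf, vf = baoab(xf, vf, grad, h / 2.0, gamma, xi1)
        xf, vf = baoab(xf, vf, grad, h / 2.0, gamma, xi2)
        xi_c = (ch * sh * xi1 + sh * xi2) / s   # matches the two O half-steps
        xc, vc = baoab(xc, vc, grad, h, gamma, xi_c)
    return xf - xc

# The RMS increment decays as h is refined, which is what makes the
# multilevel telescoping sum cheap and the debiased estimator finite-variance.
rng = random.Random(1)
grad = lambda x: x  # standard Gaussian target
def rms(h, n_pairs=200):
    return math.sqrt(sum(coupled_increment(grad, h, 1.0, 50, rng) ** 2
                         for _ in range(n_pairs)) / n_pairs)
rms_coarse, rms_fine = rms(0.4), rms(0.05)
```

The combined noise `xi_c` is scaled so that it has unit variance and reproduces the coarse chain's exact OU update, which is the standard synchronous coupling used in multilevel Monte Carlo for SDEs.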
Neil K. Chada
Department of Mathematics, City University of Hong Kong
B. Leimkuhler
School of Mathematics, University of Edinburgh
Daniel Paulin
Associate Professor, Nanyang Technological University
Bayesian computation · applied probability · machine learning and optimization · data assimilation
P. Whalley
Seminar for Statistics, ETH Zurich