Sampling from Bayesian Neural Network Posteriors with Symmetric Minibatch Splitting Langevin Dynamics

📅 2024-10-14
🏛️ arXiv.org
📈 Citations: 2
Influential: 1
🤖 AI Summary
To address the high bias and low efficiency of posterior sampling in Bayesian neural networks (BNNs) in large-scale data regimes, this paper proposes the Symmetric Minibatch Splitting-UBU (SMS-UBU) integrator. SMS-UBU couples a symmetric forward/backward sweep over minibatches with a symmetric UBU-type splitting of kinetic Langevin dynamics, achieving O(h² d^{1/2}) sampling bias in dimension d with stepsize h while using only a single minibatch per iteration, thereby substantially improving on conventional stochastic-gradient MCMC methods. The algorithm unifies kinetic Langevin dynamics, symmetric splitting integrators, and BNN posterior modeling. Experiments on Fashion-MNIST, Celeb-A, and chest X-ray datasets demonstrate that SMS-UBU significantly improves posterior predictive calibration, surpassing both standard training and stochastic weight averaging (SWA). This work advances scalable, accurate Bayesian inference for deep learning by reconciling computational efficiency with theoretical fidelity in posterior approximation.

📝 Abstract
We propose a scalable kinetic Langevin dynamics algorithm for sampling parameter spaces of big data and AI applications. Our scheme combines a symmetric forward/backward sweep over minibatches with a symmetric discretization of Langevin dynamics. For a particular Langevin splitting method (UBU), we show that the resulting Symmetric Minibatch Splitting-UBU (SMS-UBU) integrator has bias $O(h^2 d^{1/2})$ in dimension $d>0$ with stepsize $h>0$, despite only using one minibatch per iteration, thus providing excellent control of the sampling bias as a function of the stepsize. We apply the algorithm to explore local modes of the posterior distribution of Bayesian neural networks (BNNs) and evaluate the calibration performance of the posterior predictive probabilities for neural networks with convolutional neural network architectures for classification problems on three different datasets (Fashion-MNIST, Celeb-A and chest X-ray). Our results indicate that BNNs sampled with SMS-UBU can offer significantly better calibration performance compared to standard methods of training and stochastic weight averaging.
Problem

Research questions and friction points this paper is trying to address.

Scalable sampling for Bayesian Neural Networks
Control sampling bias with symmetric minibatch splitting
Improve calibration in neural network predictions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Symmetric Minibatch Splitting Langevin Dynamics
UBU integrator with O(h^2 d^{1/2}) bias
Improved calibration in Bayesian Neural Networks
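The scheme described above can be sketched in a few lines. The following is a minimal, illustrative implementation based only on the abstract's description: a UBU step (exact Ornstein-Uhlenbeck/free-flight half-step, full gradient kick, second half-step) applied once per minibatch, sweeping the minibatch gradient estimators forward then backward. Function names, the unit-mass convention, and the toy Gaussian target are assumptions for illustration, not the authors' code.

```python
import numpy as np

def u_step(x, v, t, gamma, rng):
    """Exact solve of the OU/free-flight part dx = v dt,
    dv = -gamma v dt + sqrt(2 gamma) dW (unit mass), including the
    correct correlation between position and velocity noise."""
    eta = np.exp(-gamma * t)
    var_v = 1.0 - eta ** 2
    var_x = (2.0 / gamma) * (t - 2.0 * (1.0 - eta) / gamma
                             + (1.0 - eta ** 2) / (2.0 * gamma))
    cov_xv = (1.0 - eta) ** 2 / gamma
    z1 = rng.standard_normal(x.shape)
    z2 = rng.standard_normal(x.shape)
    dv = np.sqrt(var_v) * z1
    # Conditional decomposition so that Cov(dx, dv) = cov_xv
    dx = (cov_xv / np.sqrt(var_v)) * z1 \
        + np.sqrt(max(var_x - cov_xv ** 2 / var_v, 0.0)) * z2
    return x + (1.0 - eta) / gamma * v + dx, eta * v + dv

def ubu_step(x, v, grad, h, gamma, rng):
    """One UBU step: half U, full-step gradient kick (B), half U."""
    x, v = u_step(x, v, h / 2, gamma, rng)
    v = v - h * grad(x)
    x, v = u_step(x, v, h / 2, gamma, rng)
    return x, v

def sms_ubu(grads, x0, h, gamma, n_sweeps, rng):
    """Symmetric minibatch splitting: each sweep applies one UBU step
    per minibatch, visiting the minibatch gradient estimators in a
    palindromic (forward then backward) order."""
    x = x0.astype(float).copy()
    v = np.zeros_like(x)
    order = list(range(len(grads))) + list(reversed(range(len(grads))))
    samples = []
    for _ in range(n_sweeps):
        for k in order:
            x, v = ubu_step(x, v, grads[k], h, gamma, rng)
        samples.append(x.copy())
    return np.array(samples)

# Toy check: standard Gaussian target f(x) = ||x||^2 / 2 in d = 2,
# "split" into K pieces whose estimators each return the full gradient.
rng = np.random.default_rng(0)
grads = [lambda x: x for _ in range(4)]
samples = sms_ubu(grads, np.zeros(2), h=0.1, gamma=1.0,
                  n_sweeps=2000, rng=rng)
post = samples[500:]  # discard burn-in
# Marginal mean should be near 0 and variance near 1.
```

With the exact gradient in the toy target, the only error is the O(h²) discretization bias of the splitting; in the BNN setting the point of the symmetric sweep is that this order is retained even with one minibatch per iteration.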
Daniel Paulin
Associate Professor, Nanyang Technological University
Bayesian computation, applied probability, machine learning and optimization, data assimilation
P. Whalley
Seminar for Statistics, ETH Zürich, Zürich, Switzerland
Neil K. Chada
City University of Hong Kong, Hong Kong SAR
B. Leimkuhler
University of Edinburgh, Edinburgh, UK