Accurate Large-sample Uncertainty Quantification using Stochastic Gradient Markov Chain Monte Carlo

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of tuning stochastic gradient Markov chain Monte Carlo methods for accurate uncertainty quantification when the batch size is large or the model is misspecified. The authors propose new discrete-time approximations to stochastic gradient (Langevin) dynamics (SG(L)D), with and without momentum, that enable accurate predictions of the stationary covariance, the iterate-average covariance, and the integrated autocorrelation time. They prove quantitative, non-asymptotic error bounds showing that these predictions are accurate enough for practical tuning and uncertainty quantification. In numerical experiments, the resulting tuning guidance improves on existing approaches across a range of models and data-generating distributions, including when the β-divergence is used in place of the log-loss to obtain statistically robust inferences.

📝 Abstract

Tuning algorithms such as stochastic gradient descent (SGD) and stochastic gradient Langevin dynamics (SGLD) for approximate sampling and uncertainty quantification remains challenging, particularly in the practically relevant settings where the batch size is large or the model is misspecified. Existing theory that provides tuning guidance relies on continuous-time limits or strong statistical assumptions, which can become quantitatively inaccurate in these regimes. We address these shortcomings by proposing new discrete-time approximations to SG(L)D with and without momentum, which enable accurate predictions of the stationary covariance, iterate average covariance, and integrated autocorrelation time. Moreover, we prove quantitative, non-asymptotic error bounds showing that these estimates are sufficiently accurate for practical tuning and uncertainty quantification. Numerical experiments demonstrate that our theory yields improved tuning guidance across a range of models and data-generating distributions where existing approaches fail, including when using the $\beta$-divergence rather than log-loss to obtain statistically robust inferences.
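To make the quantities named in the abstract concrete, here is a minimal Python sketch on a toy problem. It is not the paper's approximation. It assumes a one-dimensional Gaussian mean model with known variance `sigma2`, a flat prior, and minibatches drawn with replacement. In that special case the SGLD iterates form an exact AR(1) process, so the stationary variance and integrated autocorrelation time are available in closed form without taking a continuous-time limit. All function names and parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def sgld_gaussian_mean(data, step, batch_size, n_iter, sigma2=1.0, seed=0):
    """Run SGLD on the mean of a Gaussian with known variance (toy model)."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    theta = data.mean()
    chain = np.empty(n_iter)
    for k in range(n_iter):
        # Minibatch drawn with replacement (an assumption of this sketch).
        idx = rng.integers(0, n, size=batch_size)
        # Stochastic gradient of U(theta) = sum_i (theta - x_i)^2 / (2 sigma2).
        grad = n * (theta - data[idx].mean()) / sigma2
        theta = theta - step * grad + np.sqrt(2.0 * step) * rng.standard_normal()
        chain[k] = theta
    return chain


def ar1_predictions(data, step, batch_size, sigma2=1.0):
    """Closed-form stationary variance and IAT for the toy chain above.

    The update is theta' = a * theta + c * batch_mean + sqrt(2 * step) * xi
    with c = step * n / sigma2 and a = 1 - c, i.e. an AR(1) process.
    Requires 0 < c < 2 for stationarity.
    """
    n = data.shape[0]
    c = step * n / sigma2
    a = 1.0 - c
    # Minibatch-mean variance (sampling with replacement) plus injected noise.
    noise_var = c**2 * data.var() / batch_size + 2.0 * step
    stat_var = noise_var / (1.0 - a**2)
    iat = (1.0 + a) / (1.0 - a)
    return stat_var, iat


def empirical_iat(chain, max_lag):
    """Truncated-sum estimate of the integrated autocorrelation time."""
    x = chain - chain.mean()
    var = np.mean(x * x)
    rho = [np.mean(x[:-lag] * x[lag:]) / var for lag in range(1, max_lag + 1)]
    return 1.0 + 2.0 * float(np.sum(rho))
```

With 1000 observations, `sigma2=1.0`, and `step=2e-4`, the AR(1) coefficient is 0.8 and the predicted integrated autocorrelation time is 9. For any stable step size in this toy model, the predicted stationary variance is larger than the exact posterior variance `sigma2 / n`, both because of minibatch noise and because of the finite step size. Comparing `chain.var()` and `empirical_iat(chain, max_lag=40)` with the output of `ar1_predictions` shows how far a given tuning is from the target.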

Problem

Research questions and friction points this paper is trying to address.

uncertainty quantification

stochastic gradient MCMC

large-batch training

model misspecification

discrete-time approximation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Stochastic Gradient MCMC

Uncertainty Quantification

Discrete-time Approximation