🤖 AI Summary
This work addresses the lack of principled hyperparameter tuning guidance for SGLD-Gibbs (stochastic gradient Langevin dynamics combined with Gibbs updates), a gap that undermines the statistical reliability of uncertainty quantification in latent variable models. We establish a joint jump-diffusion asymptotic theory for SGLD-Gibbs under space-time rescaling: the global parameters converge to a diffusion-type limit, while each latent variable converges to a jump process. This coupled limit shows how the randomness of the latent variables contributes to the stationary distribution of the global parameters. Building on this theory, we propose explicit, statistically grounded guidance for hyperparameter tuning. In numerical experiments, SGLD-Gibbs tuned with this guidance gives better parameter estimates, uncertainty quantification, and predictive performance than stochastic variational inference.
📝 Abstract
Stochastic gradient Langevin dynamics combined with Gibbs updates (SGLD-Gibbs) provides a highly scalable approach to approximate Bayesian inference in latent variable models. However, it remains unclear how to tune the algorithm's hyperparameters in a principled manner to ensure the uncertainty estimates are statistically meaningful. In this work, we address this gap in tuning guidance by developing a statistical scaling limit theory for SGLD-Gibbs. We derive a joint asymptotic limit for the global parameters and latent variables under appropriate space-time rescaling. We show that global parameters converge to a diffusion-type limit, while each latent variable converges to a jump process, reflecting the use of intermittent Gibbs updates. This joint jump-diffusion structure reveals how latent-variable randomness contributes to the stationary distribution of the global parameters. We leverage our results to propose explicit guidance on hyperparameter tuning for SGLD-Gibbs that ensures meaningful uncertainty quantification. Numerical experiments show that SGLD-Gibbs with our tuning guidance leads to better parameter estimates, uncertainty quantification, and predictive performance than stochastic variational inference.
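To make the structure of the algorithm concrete, the following is a minimal sketch of an SGLD-Gibbs loop on a toy model; it is not the paper's implementation or experimental setup. The model (a two-component Gaussian mixture with unit variances and equal weights), the function name `sgld_gibbs`, the Gaussian prior, and all hyperparameter values are assumptions chosen purely for illustration. The global parameters are the two component means, updated by a Langevin step on a minibatch gradient; each latent assignment is refreshed by a Gibbs draw only when its data point lands in the current minibatch, which is one simple way to realize intermittent Gibbs updates.

```python
import numpy as np


def sgld_gibbs(x, n_iters=2000, batch_size=50, step=1e-3, prior_var=100.0, seed=0):
    """Toy SGLD-Gibbs for a 2-component Gaussian mixture (illustrative assumption).

    Global parameters: component means mu[0], mu[1] (unit variances, equal weights).
    Latent variables: assignments z[i] in {0, 1}, one per data point.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    mu = np.array([-1.0, 1.0])          # initial global parameters (assumed)
    z = rng.integers(0, 2, size=n)      # initial latent assignments
    trace = np.empty((n_iters, 2))

    for t in range(n_iters):
        idx = rng.choice(n, size=batch_size, replace=False)
        xb = x[idx]

        # Gibbs update, only for latents in the minibatch:
        # p(z_i = k | x_i, mu) is proportional to exp(-(x_i - mu_k)^2 / 2).
        logits = -0.5 * (xb[:, None] - mu[None, :]) ** 2
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        z[idx] = (rng.random(batch_size) < probs[:, 1]).astype(int)

        # Minibatch estimate of the gradient of the log joint with respect to mu,
        # with the likelihood term rescaled by n / batch_size.
        grad = -mu / prior_var
        for k in range(2):
            mask = z[idx] == k
            grad[k] += (n / batch_size) * np.sum(xb[mask] - mu[k])

        # SGLD step: half-step along the gradient plus Gaussian noise of variance `step`.
        mu = mu + 0.5 * step * grad + np.sqrt(step) * rng.normal(size=2)
        trace[t] = mu

    return trace, z


# Usage on synthetic data (assumed true means -2 and 2).
data_rng = np.random.default_rng(1)
x = np.concatenate([data_rng.normal(-2.0, 1.0, 500), data_rng.normal(2.0, 1.0, 500)])
trace, z = sgld_gibbs(x)
posterior_mean = trace[500:].mean(axis=0)   # discard the first 500 iterations as burn-in
posterior_sd = trace[500:].std(axis=0)
print(posterior_mean, posterior_sd)
```

In a sketch like this, the spread of the samples in `trace` depends on `step` and `batch_size` as well as on the data, because both the injected noise and the minibatch and latent-variable randomness feed into the stationary distribution. That dependence is the reason hyperparameter choices affect whether the reported uncertainty is statistically meaningful.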