Large-scale Score-based Variational Posterior Inference for Bayesian Deep Neural Networks

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the computational inefficiency and poor scalability of traditional evidence lower bound (ELBO)-based variational inference in large-scale Bayesian neural networks. The authors propose a scalable variational inference method that combines a score-matching loss with a proximal penalty, eliminating the need for the reparameterization trick and enabling optimization via noisy, unbiased minibatch gradients. The approach is presented as the first integration of score matching and proximal optimization, extending applicability to richer families of variational distributions. It scales to large architectures such as Vision Transformers, as demonstrated by experiments on benchmark tasks in visual recognition and time-series forecasting, which show favorable performance and scalability.

📝 Abstract
Bayesian (deep) neural networks (BNNs) are often more attractive than mainstream point-estimate vanilla deep learning in various respects, including uncertainty quantification, robustness to noise, resistance to overfitting, and more. Variational inference (VI) is one of the most widely adopted approximate inference methods. Whereas the ELBO-based variational free energy method is the dominant choice in the literature, in this paper we introduce a score-based alternative for BNN variational inference. Although quite a few score-based variational inference methods have been proposed in the community, most are not adequate for large-scale BNNs for various computational and technical reasons. We propose a novel scalable VI method whose learning objective combines the score matching loss with a proximal penalty term across iterations, which helps our method avoid reparameterized sampling and allows for noisy, unbiased mini-batch scores through stochastic gradients. This in turn makes our method scalable to large-scale neural networks, including Vision Transformers, and allows for richer variational density families. On several benchmarks, including visual recognition and time-series forecasting with large-scale deep networks, we empirically show the effectiveness of our approach.
Problem

Research questions and friction points this paper is trying to address.

score-based variational inference
Bayesian deep neural networks
large-scale inference
scalability
variational posterior
Innovation

Methods, ideas, or system contributions that make the work stand out.

score-based variational inference
scalable Bayesian deep learning
score matching
proximal penalty
large-scale BNN