Streamlining Prediction in Bayesian Deep Learning

📅 2024-11-27
🏛️ arXiv.org
🤖 AI Summary
This work addresses the high computational cost and deployment challenges of Bayesian deep learning (BDL), which typically relies on Monte Carlo (MC) sampling for predictive inference. We propose a sampling-free, single-forward-pass Bayesian prediction method. Our approach locally linearizes activation functions and approximates the posterior distribution over linear-layer weights with a local Gaussian approximation, enabling analytical derivation of the posterior predictive distribution. To our knowledge, this is the first method to enable fully MC-free Bayesian inference on mainstream architectures—including MLPs, Vision Transformers (ViTs), and GPT-2—while preserving theoretical interpretability and engineering practicality. Experiments demonstrate that our method matches the predictive accuracy of standard MC-based inference on both regression and classification benchmarks, while accelerating inference by over an order of magnitude. This substantially reduces the computational overhead and deployment cost of BDL in real-world applications.
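To make the mechanism concrete, here is a minimal sketch of Gaussian moment propagation through one linear layer followed by a locally linearized activation. It assumes a factorized Gaussian weight posterior with elementwise variances and a diagonal input covariance; these are illustrative simplifications, not the paper's exact parameterization.

```python
import numpy as np

def linear_layer_moments(x_mean, x_var, W_mean, W_var, b_mean, b_var):
    """Propagate a diagonal Gaussian through a linear layer whose weights
    have an independent Gaussian posterior (illustrative assumption).

    For independent w ~ N(m_w, v_w) and x ~ N(m_x, v_x):
      Var[w * x] = v_w * (m_x**2 + v_x) + m_w**2 * v_x
    summed over the input dimension via the matrix products below.
    """
    y_mean = W_mean @ x_mean + b_mean
    y_var = (W_var @ (x_mean ** 2 + x_var)
             + (W_mean ** 2) @ x_var
             + b_var)
    return y_mean, y_var

def linearized_activation(y_mean, y_var, f, f_prime):
    """Local linearization: f(y) ≈ f(mu) + f'(mu)(y - mu), so the output
    stays Gaussian with mean f(mu) and variance f'(mu)^2 * var."""
    return f(y_mean), (f_prime(y_mean) ** 2) * y_var
```

Chaining these two steps layer by layer keeps the distribution Gaussian throughout the network, which is what permits a single deterministic forward pass instead of averaging over Monte Carlo weight samples.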

📝 Abstract
The rising interest in Bayesian deep learning (BDL) has led to a plethora of methods for estimating the posterior distribution. However, efficient computation of inferences, such as predictions, has been largely overlooked, with Monte Carlo integration remaining the standard. In this work we examine streamlining prediction in BDL through a single forward pass without sampling. For this we use local linearisation of activation functions and local Gaussian approximations at linear layers, allowing us to analytically compute an approximation to the posterior predictive distribution. We showcase our approach for both MLPs and transformers, such as ViT and GPT-2, and assess its performance on regression and classification tasks.
Problem

Research questions and friction points this paper is trying to address.

Efficient prediction in Bayesian deep learning
Single forward pass without sampling
Analytical approximation of posterior predictive distribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Single forward pass for prediction in BDL
Local linearisation on activation functions
Local Gaussian approximations at linear layers
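Once a Gaussian over the output logits is available, the final predictive can also be obtained in closed form. As a stand-in for the paper's analytic predictive, the sketch below uses the classic probit approximation to the Gaussian-averaged sigmoid (binary case); this specific approximation is an assumption for illustration, not necessarily the one used in the paper.

```python
import numpy as np

def probit_predictive(logit_mean, logit_var):
    """Approximate E[sigmoid(a)] for a ~ N(logit_mean, logit_var) with the
    probit approximation: sigmoid(kappa * mean), kappa = 1/sqrt(1 + pi*var/8).
    Replaces Monte Carlo averaging of sampled logits with one expression."""
    kappa = 1.0 / np.sqrt(1.0 + np.pi * logit_var / 8.0)
    return 1.0 / (1.0 + np.exp(-kappa * logit_mean))
```

Note how larger logit variance shrinks the prediction toward 0.5, so predictive uncertainty is retained even though no samples are drawn.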