Near-Optimal Approximations for Bayesian Inference in Function Space

📅 2025-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the scalability challenge of Bayesian posterior inference in reproducing kernel Hilbert spaces (RKHS). The authors propose a Langevin-based variational inference method grounded in Kosambi–Karhunen–Loève (K-L) truncation: the infinite-dimensional Langevin diffusion is projected onto the first $M$ K-L eigenfunctions, and a nonparametric variational family $\mathcal{P}(\mathbb{R}^M)$, the set of all probability measures on $\mathbb{R}^M$, is used in this finite-dimensional subspace to approximate the posterior. A theoretically guaranteed approximation-error bound is established for convex, Lipschitz-continuous negative log-likelihoods. The framework recovers sparse variational Gaussian processes (SVGP) as a special case while removing their restrictive Gaussian-process parametrization. The algorithm runs in $\mathcal{O}(M^3 + JM^2)$ time, where $J$ is the number of posterior samples produced, balancing theoretical optimality with practical scalability.
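To make the two-step structure concrete, here is a minimal sketch of a K-L truncation followed by a finite-dimensional Langevin sampler on the retained coefficients. The squared-exponential kernel, the Nyström-style approximation of the eigenpairs, the Gaussian-error likelihood, and all hyperparameters below are illustrative assumptions on our part, not choices taken from the paper.

```python
# Hypothetical sketch: K-L truncation of a GP prior + unadjusted Langevin
# sampling of the M retained coefficients. All modelling choices are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# --- toy regression data ------------------------------------------------------
n = 200
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)
sigma = 0.1                                   # noise level of the Gaussian-error likelihood

# --- Nystrom-style approximation of the K-L eigenpairs ------------------------
def rbf_kernel(a, b, ell=0.2):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

K = rbf_kernel(x, x)                          # Gram matrix of the prior kernel
evals, evecs = np.linalg.eigh(K)              # eigenvalues in ascending order
M = 15
lam = evals[-M:][::-1] / n                    # top-M eigenvalues of the covariance operator (approx.)
phi = evecs[:, -M:][:, ::-1] * np.sqrt(n)     # top-M eigenfunctions evaluated at x (approx.)

# Truncated K-L map: f(x) ~ phi @ (sqrt(lam) * z), with z ~ N(0, I_M) a priori.
B = phi * np.sqrt(lam)                        # n x M feature matrix

# --- unadjusted Langevin algorithm on the M coefficients ----------------------
def grad_potential(z):
    resid = y - B @ z
    grad_nll = -(B.T @ resid) / sigma**2      # gradient of the negative log-likelihood
    return grad_nll + z                       # plus gradient of the N(0, I_M) prior term

step, J = 1e-4, 20_000
z = np.zeros(M)
samples = []
for _ in range(J):
    z = z - step * grad_potential(z) + np.sqrt(2 * step) * rng.standard_normal(M)
    samples.append(B @ z)                     # posterior function samples evaluated at x

post_mean = np.mean(samples[J // 2 :], axis=0)   # discard burn-in, average the rest
```

The $\mathcal{O}(M^3)$ cost in the sketch corresponds to the one-off eigendecomposition step, and the $\mathcal{O}(JM^2)$ cost to the $J$ Langevin updates on the $M$-dimensional coefficient vector.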

📝 Abstract
We propose a scalable inference algorithm for Bayes posteriors defined on a reproducing kernel Hilbert space (RKHS). Given a likelihood function and a Gaussian random element representing the prior, the corresponding Bayes posterior measure $\Pi_{\text{B}}$ can be obtained as the stationary distribution of an RKHS-valued Langevin diffusion. We approximate the infinite-dimensional Langevin diffusion via a projection onto the first $M$ components of the Kosambi–Karhunen–Loève expansion. Exploiting the thus obtained approximate posterior for these $M$ components, we perform inference for $\Pi_{\text{B}}$ by relying on the law of total probability and a sufficiency assumption. The resulting method scales as $O(M^3+JM^2)$, where $J$ is the number of samples produced from the posterior measure $\Pi_{\text{B}}$. Interestingly, the algorithm recovers the posterior arising from the sparse variational Gaussian process (SVGP) (see Titsias, 2009) as a special case, owed to the fact that the sufficiency assumption underlies both methods. However, whereas the SVGP is parametrically constrained to be a Gaussian process, our method is based on a non-parametric variational family $\mathcal{P}(\mathbb{R}^M)$ consisting of all probability measures on $\mathbb{R}^M$. As a result, our method is provably close to the optimal $M$-dimensional variational approximation of the Bayes posterior $\Pi_{\text{B}}$ in $\mathcal{P}(\mathbb{R}^M)$ for convex and Lipschitz continuous negative log likelihoods, and coincides with SVGP for the special case of a Gaussian error likelihood.
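For readers unfamiliar with the construction, the truncation underlying the abstract can be summarized as follows (the notation is ours, chosen for illustration). The Gaussian prior admits a Kosambi–Karhunen–Loève expansion, and the method restricts the Langevin dynamics to the first $M$ terms:

$$
f \;=\; \sum_{m=1}^{\infty} \sqrt{\lambda_m}\, z_m\, e_m, \qquad z_m \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0,1),
\qquad\qquad
f_M \;=\; \sum_{m=1}^{M} \sqrt{\lambda_m}\, z_m\, e_m,
$$

where $(\lambda_m, e_m)$ are the eigenpairs of the prior covariance operator. Inference is then carried out over the law of the coefficient vector $(z_1,\dots,z_M)\in\mathbb{R}^M$, i.e. over an element of the nonparametric family $\mathcal{P}(\mathbb{R}^M)$, and samples of the full posterior $\Pi_{\text{B}}$ are reconstructed from the law of total probability together with the sufficiency assumption.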
Problem

Research questions and friction points this paper is trying to address.

Scalable Bayesian inference in RKHS
Approximate infinite-dimensional Langevin diffusion
Non-parametric variational approximation of Bayes posterior
Innovation

Methods, ideas, or system contributions that make the work stand out.

Scalable RKHS Bayesian inference algorithm
Projection on Kosambi-Karhunen-Loève expansion
Non-parametric variational family for approximation