STRIDE: Sparse Techniques for Regression in Deep Gaussian Processes

📅 2025-05-16
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
Gaussian processes (GPs) and their deep extensions face significant challenges in large-scale and multi-scale function modeling, including low training efficiency, poor uncertainty calibration, and difficulty in jointly optimizing inducing point locations and deep kernel hyperparameters. To address these issues, we propose a particle-based Expectation-Maximization (EM) framework that, for the first time, embeds particle filtering into a variational EM procedure for deep sparse GPs. Our method integrates MCMC sampling with variational inference, enabling joint optimization of inducing points and deep kernel parameters while preserving computational scalability. Crucially, it substantially improves posterior uncertainty characterization without sacrificing efficiency. On standard benchmarks, our approach achieves superior predictive accuracy and more reliable uncertainty quantification compared to state-of-the-art deep GP methods, while also demonstrating significantly faster training convergence.
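To make the training loop concrete, the sketch below shows one plausible shape of a particle-based EM iteration for a single-layer sparse GP. It is a minimal illustration under assumptions of our own (RBF kernel, random-walk Metropolis moves in the E-step, a closed-form noise update in the M-step); the function and variable names are hypothetical and do not correspond to the paper's code, and the paper's M-step additionally updates the inducing locations and deep kernel parameters variationally.

```python
import numpy as np

def rbf(A, B, ell=0.5, var=1.0):
    """Squared-exponential kernel k(a, b) = var * exp(-|a - b|^2 / (2 ell^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / ell**2)

def particle_em(X, y, Z, n_particles=32, n_iters=200, prop_scale=0.1, seed=0):
    """Toy particle-based EM for a sparse GP with fixed inducing points Z."""
    rng = np.random.default_rng(seed)
    M = Z.shape[0]
    noise = 0.1                                   # observation noise variance
    Kzz = rbf(Z, Z) + 1e-6 * np.eye(M)
    Kzz_inv = np.linalg.inv(Kzz)
    A = np.linalg.solve(Kzz, rbf(Z, X)).T         # maps inducing outputs u to f(X)
    U = rng.normal(size=(n_particles, M))         # particles approximating p(u | y)
    for _ in range(n_iters):
        # E-step (sampling-based): one random-walk Metropolis move per
        # particle, targeting the unnormalised posterior p(y | u) p(u).
        def log_post(u):
            r = y - A @ u
            return -0.5 * (r @ r / noise + u @ Kzz_inv @ u)
        for p in range(n_particles):
            prop = U[p] + prop_scale * rng.normal(size=M)
            if np.log(rng.uniform()) < log_post(prop) - log_post(U[p]):
                U[p] = prop
        # M-step (closed form here): re-estimate the noise variance from the
        # particle-averaged residuals; the full method would also move Z and
        # the kernel hyperparameters.
        noise = np.mean([(y - A @ u) @ (y - A @ u) for u in U]) / len(y)
    return U, noise
```

The particle ensemble plays the role that a single variational Gaussian would play in purely variational sparse GP training; its spread is what gives the sampling-based picture of posterior uncertainty that the summary highlights.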

📝 Abstract
Gaussian processes (GPs) have gained popularity as flexible machine learning models for regression and function approximation with an in-built method for uncertainty quantification. However, GPs suffer when the amount of training data is large or when the underlying function contains multi-scale features that are difficult to represent by a stationary kernel. To address the former, training of GPs with large-scale data is often performed through inducing point approximations, also known as sparse GP regression (GPR), where the size of the covariance matrices in GPR is reduced considerably through a greedy search on the data set. To address the latter, deep GPs have gained traction as hierarchical models that resolve multi-scale features by combining multiple GPs. Posterior inference in deep GPs requires a sampling-based or, more usually, a variational approximation. Variational approximations lead to large-scale stochastic, non-convex optimisation problems, and the resulting approximation tends to represent uncertainty incorrectly. In this work, we combine variational learning with MCMC to develop a particle-based expectation-maximisation method that simultaneously finds inducing points within the large-scale data (variationally) and accurately trains the GPs (sampling-based). The result is a highly efficient and accurate methodology for deep GP training on large-scale data. We test our method on standard benchmark problems.
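As background for the inducing point idea, the following sketch shows the standard DTC/SoR predictive mean, a common sparse GPR formulation that may differ from the exact variant used in the paper. With M inducing points Z, the only expensive linear solve involves an M-by-M matrix, so the cost drops from O(N^3) for exact GPR to O(N M^2 + M^3).

```python
import numpy as np

def rbf(A, B, ell=0.5, var=1.0):
    """Squared-exponential kernel evaluated between two point sets."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / ell**2)

def sparse_gp_mean(X, y, Z, Xs, noise=0.1):
    """DTC/SoR predictive mean with M inducing points Z:
    mu(x*) = K_{*z} (noise * K_{zz} + K_{zx} K_{xz})^{-1} K_{zx} y
    """
    M = Z.shape[0]
    Kzz = rbf(Z, Z) + 1e-8 * np.eye(M)
    Kzx = rbf(Z, X)                       # (M, N): no N x N matrix is ever formed
    Sigma = noise * Kzz + Kzx @ Kzx.T     # (M, M): the single expensive solve
    alpha = np.linalg.solve(Sigma, Kzx @ y)
    return rbf(Xs, Z) @ alpha
```

Choosing Z well is exactly the job of the greedy search mentioned above; the method in this paper instead selects the inducing points variationally inside its EM loop.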
Problem

Research questions and friction points this paper is trying to address.

Addressing scalability issues in Gaussian processes for large datasets
Improving multi-scale feature representation in deep Gaussian processes (see the sketch after this list)
Combining variational learning with MCMC for accurate uncertainty quantification
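The multi-scale point in the second item can be made concrete with a toy composition, sketched below under assumptions of our own (RBF kernels, sample-path composition); it is not the paper's code. Feeding one GP sample path into a second GP warps the input space, so each layer keeps a single stationary lengthscale while the composite exhibits both fast and slow variation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gp_path(X, ell=0.2, var=1.0, jitter=1e-6):
    """Draw one sample path of a zero-mean GP with an RBF kernel at inputs X."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = var * np.exp(-0.5 * d2 / ell**2) + jitter * np.eye(len(X))
    return np.linalg.cholesky(K) @ rng.normal(size=len(X))

X = np.linspace(0.0, 1.0, 150)[:, None]
h = sample_gp_path(X)              # layer 1: a smooth random warp of the input
f = sample_gp_path(h[:, None])     # layer 2: a GP evaluated on the warped inputs
# Where the warp h is steep, f varies quickly; where h is flat, f varies
# slowly: multi-scale behaviour from two stationary layers.
```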
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines variational learning with MCMC
Uses a particle-based expectation-maximisation method
Efficient deep GP training on large-scale data
Simon Urbainczyk
Heriot-Watt University, Maxwell Institute for Mathematical Sciences
A. Teckentrup
Maxwell Institute for Mathematical Sciences, University of Edinburgh
Jonas Latz
University of Manchester
Bayesian Inference · Numerical Analysis · Data Science · Uncertainty Quantification · Machine Learning