Scalable mixed-domain Gaussian process modeling and model reduction for longitudinal data

📅 2021-11-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing Gaussian process (GP) models for longitudinal data with mixed categorical and continuous variables suffer from cubic-time inference complexity, and common scalable approximation schemes do not apply because covariance functions over hybrid input domains are discontinuous. Method: We propose a basis-function-based approximate covariance function designed for mixed inputs, enabling a scalable Bayesian GP regression framework. Furthermore, we introduce an interpretability-driven additive structure learning procedure that automatically identifies dominant main effects and variable interactions. Contribution/Results: Our approach reduces inference complexity to linear time in the number of observations while maintaining accuracy comparable to exact GP inference. It yields compact, interpretable models—particularly advantageous in multi-predictor settings—and provides a longitudinal data analysis workflow that jointly achieves scalability, predictive accuracy, and model interpretability.
📝 Abstract
Gaussian process (GP) models that combine both categorical and continuous input variables have found use in analysis of longitudinal data and computer experiments. However, standard inference for these models has the typical cubic scaling, and common scalable approximation schemes for GPs cannot be applied since the covariance function is discontinuous. In this work, we derive a basis function approximation scheme for mixed-domain covariance functions, which scales linearly with respect to the number of observations and total number of basis functions. The proposed approach is also naturally applicable to Bayesian GP regression with discrete observation models. We demonstrate the scalability of the approach and compare model reduction techniques for additive GP models in a longitudinal data context. We confirm that we can approximate the exact GP model accurately in a fraction of the runtime compared to fitting the corresponding exact model. In addition, we demonstrate a scalable model reduction workflow for obtaining smaller and more interpretable models when dealing with a large number of candidate predictors.
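The abstract's core idea—replacing an exact mixed-domain covariance with basis-function features so that inference scales linearly in the number of observations—can be illustrated with a minimal sketch. This is an illustrative assumption of the general construction, not the paper's exact derivation: it combines a Hilbert-space (sinusoidal) basis approximation of a squared-exponential kernel on the continuous input with a simple delta (identity) kernel on the categorical input; all function names and hyperparameter values are hypothetical.

```python
import numpy as np

def hsgp_features(x, m=16, L=3.0, lengthscale=0.5, sigma=1.0):
    """Hilbert-space basis features approximating a squared-exponential
    kernel on [-L, L]. Returns an (n, m) matrix Phi such that
    k(x, x') is approximately Phi @ Phi.T."""
    j = np.arange(1, m + 1)
    sqrt_lam = np.pi * j / (2 * L)  # square roots of Laplacian eigenvalues
    # spectral density of the SE kernel evaluated at sqrt(lambda_j)
    S = sigma**2 * np.sqrt(2 * np.pi) * lengthscale * np.exp(
        -0.5 * (lengthscale * sqrt_lam) ** 2)
    phi = np.sqrt(1.0 / L) * np.sin(sqrt_lam[None, :] * (x[:, None] + L))
    return phi * np.sqrt(S)[None, :]

def mixed_domain_features(x, z, n_cat, **kw):
    """Features for a product kernel k_cont(x, x') * [z == z']:
    replicate the continuous basis once per category level."""
    phi = hsgp_features(x, **kw)          # (n, m)
    onehot = np.eye(n_cat)[z]             # (n, n_cat)
    return (phi[:, :, None] * onehot[:, None, :]).reshape(len(x), -1)

# Regression in feature space costs O(n m^2) rather than O(n^3).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1, 1, n)
z = rng.integers(0, 3, n)                 # categorical covariate with 3 levels
y = np.sin(3 * x) + 0.3 * z + 0.1 * rng.standard_normal(n)

Phi = mixed_domain_features(x, z, n_cat=3)
noise = 0.1
A = Phi.T @ Phi + noise**2 * np.eye(Phi.shape[1])  # small (48 x 48) system
w = np.linalg.solve(A, Phi.T @ y)                  # posterior mean weights
y_hat = Phi @ w
```

The key point is that the mixed-domain kernel matrix is never formed: all computation involves the tall-and-skinny feature matrix, so the cost grows linearly with the number of observations.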
Problem

Research questions and friction points this paper is trying to address.

Develop scalable Gaussian process models for mixed-domain data.
Enable efficient inference with linear scaling in observations.
Provide model reduction for interpretability with many predictors.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linear scaling basis function approximation for mixed-domain GPs
Efficient Bayesian GP regression with discrete observation models
Scalable model reduction for interpretable additive GP models
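One generic way to realize the model reduction idea listed above—kept deliberately simple here, and not the paper's specific procedure—is to rank the fitted additive components by the share of variance each explains and keep the smallest subset reaching a threshold. The component matrix and the 95% threshold below are illustrative assumptions.

```python
import numpy as np

# Hypothetical per-component contributions f_d(x) from a fitted additive GP:
# one column per candidate predictor, with most components near-negligible.
rng = np.random.default_rng(1)
n, D = 500, 8
scales = np.array([2.0, 1.5, 0.1, 0.05, 1.0, 0.02, 0.01, 0.3])
F = rng.standard_normal((n, D)) * scales

# Rank components by explained variance, then keep the shortest prefix
# whose cumulative share reaches 95% of the total.
var = F.var(axis=0)
order = np.argsort(var)[::-1]
cum = np.cumsum(var[order]) / var.sum()
k = int(np.searchsorted(cum, 0.95) + 1)
selected = order[:k]
```

Here only three of the eight candidate components survive, yielding the kind of smaller, more interpretable model the workflow targets; a full treatment would also account for posterior uncertainty in the component variances.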
Juho Timonen
Department of Computer Science, Aalto University
H. Lähdesmäki
Department of Computer Science, Aalto University