🤖 AI Summary
To address the lack of calibrated uncertainty quantification in Gaussian process regression (GPR) predictions—particularly in data-sparse or unobserved regions—this paper introduces the *Knowledge Score*, an interpretable, [0,1]-bounded credibility metric grounded in Bayesian information gain. It quantifies the extent to which the observed data reduce uncertainty at a given prediction point. Computed analytically from the GPR posterior without requiring additional labels, the score is both efficient and model-intrinsic. Empirical evaluation demonstrates that the Knowledge Score anticipates prediction error, improving robustness and accuracy in anomaly detection, extrapolation, and missing-data imputation. More broadly, it offers a general-purpose, uncertainty-aware credibility measure for predictions from black-box GPR models.
📝 Abstract
Probabilistic models are often used to make predictions in regions of the data space where no observations are available, but it is not always clear whether such predictions are well-informed by previously seen data. In this paper, we propose a knowledge score for predictions from Gaussian process regression (GPR) models that quantifies the extent to which the observed data have reduced our uncertainty about a prediction. The knowledge score is interpretable and naturally bounded between 0 and 1. We demonstrate in several experiments that the knowledge score can anticipate when predictions from a GPR model are accurate, and that this anticipation improves performance in tasks such as anomaly detection, extrapolation, and missing data imputation. Source code for this project is available online at https://github.com/KurtButler/GP-knowledge.
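The paper's exact definition of the knowledge score lives in the linked repository; as a rough illustration of the underlying idea, the sketch below computes a [0,1]-bounded score as the fractional reduction in prior predictive variance after conditioning on the training inputs. The RBF kernel, its hyperparameters, and the specific formula `1 - post_var / prior_var` are illustrative assumptions, not the paper's definition. Note that, like the paper's score, it uses only kernel quantities at the inputs, so no labels are required.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel k(a, b) = s^2 * exp(-(a - b)^2 / (2 l^2))."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def knowledge_score(x_train, x_test, length_scale=1.0, variance=1.0, noise=1e-2):
    """Illustrative credibility score (an assumption, not the paper's formula):
    the fraction of prior predictive variance removed by conditioning on
    x_train. Near the data the posterior variance collapses and the score
    approaches 1; far away nothing is learned and the score approaches 0."""
    K = rbf_kernel(x_train, x_train, length_scale, variance)
    K += noise * np.eye(len(x_train))          # jitter / observation noise
    k_star = rbf_kernel(x_train, x_test, length_scale, variance)
    prior_var = variance                       # k(x*, x*) for a stationary kernel
    # Posterior variance: k(x*,x*) - k*^T (K + noise I)^{-1} k*, per test point
    post_var = prior_var - np.einsum("ij,ij->j", k_star, np.linalg.solve(K, k_star))
    return 1.0 - post_var / prior_var

x_train = np.array([0.0, 0.5, 1.0])
scores = knowledge_score(x_train, np.array([0.5, 5.0]))
# scores[0] (inside the data) is near 1; scores[1] (far from the data) is near 0
```

A prediction at x = 5.0 would still come with a posterior mean and variance, but its near-zero score flags that essentially no observed data informs it, which is the behavior the paper exploits for anomaly detection and extrapolation.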