🤖 AI Summary
Existing example-based prediction explanation methods often rely on model parameters or latent representations to measure sample similarity, resulting in coarse-grained explanations, high computational overhead, and limited model awareness. To address these limitations, we propose HD-Explain, a data-centric, fine-grained explanation method. HD-Explain is the first to leverage Kernelized Stein Discrepancy (KSD) to construct a model-aware, parameterized kernel function that efficiently identifies the most supportive training samples for a given test prediction, and it requires no model retraining. Empirical evaluation on multi-class classification tasks demonstrates that HD-Explain surpasses state-of-the-art baselines in explanation preciseness, consistency, and computational efficiency. Moreover, it is simple, robust, and scalable, making it broadly applicable across diverse model architectures and datasets.
📝 Abstract
Existing example-based prediction explanation methods often bridge test and training data points through the model's parameters or latent representations. While these methods offer clues to the causes of model predictions, they often exhibit innate shortcomings, such as incurring significant computational overhead or producing coarse-grained explanations. This paper presents a Highly-precise and Data-centric Explanation (HD-Explain) prediction explanation method that exploits properties of Kernelized Stein Discrepancy (KSD). Specifically, KSD uniquely defines a parameterized kernel function for a trained model that encodes model-dependent data correlation. By leveraging this kernel function, one can efficiently identify the training samples that provide the best predictive support to a test point. We conducted thorough analyses and experiments across multiple classification domains, showing that HD-Explain outperforms existing methods in several respects, including 1) preciseness (fine-grained explanations), 2) consistency, and 3) computational efficiency, leading to a surprisingly simple, effective, and robust prediction explanation solution.
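To make the kernel idea concrete, the sketch below shows a generic Stein kernel built from an RBF base kernel and a score function, and how its entries can rank training points by how strongly they support a test point. This is an illustrative toy, not the paper's exact HD-Explain construction: the score here comes from a standard Gaussian (`score(z) = -z`) rather than a trained classifier, and the kernel form is the standard KSD "Stein kernel" from the literature.

```python
import numpy as np

def rbf(x, y, h=1.0):
    """RBF kernel with its gradients w.r.t. each argument and the
    trace of the mixed second derivative (all in closed form)."""
    d = x - y
    k = np.exp(-(d @ d) / (2 * h**2))
    grad_x = -d / h**2 * k                       # ∇_x k(x, y)
    grad_y = d / h**2 * k                        # ∇_y k(x, y)
    trace = (len(x) / h**2 - (d @ d) / h**4) * k # tr ∇_x ∇_y k(x, y)
    return k, grad_x, grad_y, trace

def stein_kernel(x, y, score):
    """Standard KSD Stein kernel: pairs the base kernel with the
    score function s(·) = ∇ log p(·) of the model distribution."""
    k, gx, gy, tr = rbf(x, y)
    sx, sy = score(x), score(y)
    return (sx @ sy) * k + sx @ gy + sy @ gx + tr

# Toy usage: rank training points by Stein-kernel value against a test point.
# The Gaussian score below is an assumption for illustration only.
rng = np.random.default_rng(0)
score = lambda z: -z                      # score of N(0, I)
train = rng.normal(size=(50, 2))
test = np.array([0.5, -0.2])
values = [stein_kernel(test, t, score) for t in train]
best = int(np.argmax(values))             # most "supportive" training index
```

In HD-Explain, the score function is derived from the trained classifier itself, which is what makes the resulting kernel model-aware; the ranking step is otherwise the same idea as above.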