AI Summary
In high-stakes domains, balancing model interpretability with predictive performance remains challenging. Method: This paper proposes a personalized interpretable prediction framework that, for each query point, learns a sparse linear classifier over a local halfspace subgroup in its neighborhood. Contribution/Results: Theoretically, we establish the first PAC-learnability analysis for halfspace-based reference classes in a label-agnostic setting, deriving an $O(\mathrm{opt}^{1/4})$ sample-complexity upper bound and achieving a principled trade-off between interpretability and statistical efficiency. Algorithmically, we design a distribution-specific PAC-learning procedure integrated with a list-learning strategy for sparse linear representations, enabling efficient construction of personalized predictors. Experiments on multiple benchmark datasets demonstrate that our method maintains strong predictive performance while substantially enhancing local interpretability.
Abstract
In machine learning applications, predictive models are trained to serve future queries across the entire data distribution. However, real-world data often demands excessively complex models to achieve competitive performance, sacrificing interpretability. Hence, the growing deployment of machine learning models in high-stakes applications, such as healthcare, motivates the search for methods that deliver accurate and explainable predictions. This work proposes a Personalized Prediction scheme, where an easy-to-interpret predictor is learned per query. In particular, we wish to produce a "sparse linear" classifier with competitive performance specifically on some sub-population that includes the query point. The goal of this work is to study the PAC-learnability of this prediction model for sub-populations represented by "halfspaces" in a label-agnostic setting. We first give a distribution-specific PAC-learning algorithm for learning reference classes for personalized prediction. By leveraging both the reference-class learning algorithm and a list learner of sparse linear representations, we prove the first upper bound, $O(\mathrm{opt}^{1/4})$, for personalized prediction with sparse linear classifiers and homogeneous halfspace subsets. We also evaluate our algorithms on a variety of standard benchmark datasets.
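To make the personalized-prediction idea concrete, here is a minimal illustrative sketch, not the paper's algorithm: it approximates the local subgroup around a query by its k nearest neighbors (standing in for a learned halfspace reference class) and fits an L1-regularized linear classifier on that subgroup, so the per-query model stays sparse and interpretable. The function `personalized_predict` and all parameter choices are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def personalized_predict(X, y, x_query, k=50, C=1.0):
    """Illustrative sketch: approximate the reference class containing
    x_query by its k nearest neighbors, then fit a sparse linear model."""
    # Select a local subgroup that includes the query point.
    dists = np.linalg.norm(X - x_query, axis=1)
    idx = np.argsort(dists)[:k]
    # L1 penalty drives many coefficients to zero, keeping the
    # personalized classifier easy to interpret.
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    clf.fit(X[idx], y[idx])
    return clf.predict(x_query.reshape(1, -1))[0], clf.coef_.ravel()

# Toy demo: labels depend only on the first of five features,
# and the query sits near the decision boundary.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)
pred, coefs = personalized_predict(X, y, np.zeros(5), k=50)
```

The neighborhood selection is a stand-in for the paper's halfspace reference-class learner; swapping in a learned halfspace would only change how `idx` is computed.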