🤖 AI Summary
Model-agnostic feature selection suffers from high computational cost and insufficient theoretical foundations. This paper systematically compares two prominent approaches, the Generalized Covariance Measure (GCM) and Leave-One-Covariate-Out (LOCO), and, for the first time, derives their asymptotic relative efficiency within a unified framework under linear, nonlinear additive, and single-index models. Under suitable regularity conditions, the theory shows that GCM-related methods are generally asymptotically more efficient than LOCO. Extensive simulations and real-data experiments, using black-box models including neural networks and gradient-boosted trees, corroborate this theoretical finding. The work provides the first rigorous theoretical justification for the efficiency advantage of GCM-based methods and supports their use for model-agnostic feature importance assessment that is both statistically more efficient and inherently interpretable.
📝 Abstract
Feature selection and importance estimation in a model-agnostic setting remain an ongoing challenge of significant interest. Wrapper methods are commonly used because they are typically model-agnostic, even though they are computationally intensive. In this paper, we focus on feature selection methods related to the Generalized Covariance Measure (GCM) and Leave-One-Covariate-Out (LOCO) estimation, and compare them in terms of relative efficiency. In particular, we present a theoretical comparison under three model settings: linear models, non-linear additive models, and single-index models that mimic a single-layer neural network. We complement this with extensive simulations and real data examples. Our theoretical results, along with empirical findings, demonstrate that GCM-related methods generally outperform LOCO under suitable regularity conditions. Furthermore, we quantify the asymptotic relative efficiency of these approaches. Our simulations and real data analysis include widely used machine learning methods such as neural networks and gradient-boosted trees.
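To make the two quantities being compared concrete, here is a minimal sketch of a GCM-style and a LOCO-style test statistic for a single feature. It is not the authors' implementation: it assumes ordinary least squares as the regression method (standing in for the neural networks and boosted trees used in the paper), squared-error loss for LOCO, and a single train/test split. The function names `ols_fit_predict`, `gcm_statistic`, and `loco_statistic` are hypothetical. The GCM statistic uses the usual normalized mean of residual products, and the LOCO statistic uses the normalized mean increase in held-out loss when the feature is dropped.

```python
import numpy as np


def ols_fit_predict(X_train, y_train, X_test):
    """Fit least squares with an intercept and predict on X_test.

    Assumption: OLS stands in for any black-box regression method.
    """
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef


def gcm_statistic(X, y, j):
    """GCM-style statistic for feature j given the remaining features Z.

    Regress X_j on Z and y on Z, multiply the two residual vectors,
    and normalize the mean of the products by their standard deviation.
    """
    Z = np.delete(X, j, axis=1)
    res_x = X[:, j] - ols_fit_predict(Z, X[:, j], Z)
    res_y = y - ols_fit_predict(Z, y, Z)
    prod = res_x * res_y
    return np.sqrt(len(prod)) * prod.mean() / prod.std()


def loco_statistic(X_train, y_train, X_test, y_test, j):
    """LOCO-style statistic for feature j with squared-error loss.

    Compares held-out loss of a model refit without feature j
    against the held-out loss of the full model.
    """
    pred_full = ols_fit_predict(X_train, y_train, X_test)
    pred_reduced = ols_fit_predict(
        np.delete(X_train, j, axis=1), y_train, np.delete(X_test, j, axis=1)
    )
    delta = (y_test - pred_reduced) ** 2 - (y_test - pred_full) ** 2
    return np.sqrt(len(delta)) * delta.mean() / delta.std()


if __name__ == "__main__":
    # Synthetic linear model: only feature 0 affects the response.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 3))
    y = 2.0 * X[:, 0] + rng.normal(size=2000)
    half = len(y) // 2
    for j in range(X.shape[1]):
        gcm = gcm_statistic(X, y, j)
        loco = loco_statistic(X[:half], y[:half], X[half:], y[half:], j)
        print(f"feature {j}: GCM = {gcm:.2f}, LOCO = {loco:.2f}")
```

In this sketch a large absolute GCM value, or a large positive LOCO value, suggests the feature carries information about the response beyond the other features. The LOCO statistic here uses only the held-out half of the data for testing, whereas the GCM statistic uses all observations, which is one informal way to see why the two can differ in efficiency.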