AI Summary
Variable importance (VI) analysis is often compromised by unobserved confounding and the Rashomon effect, leading to biased and unstable single-model estimates. To address this, we propose UNIVERSE, a novel framework that extends the Rashomon set to settings with unobserved variables. UNIVERSE constructs a set of near-optimal models informed by both semi-synthetic data simulation and observational data-driven model family inference, enabling quantification of theoretical VI bounds under missing features and yielding robust interval estimates. Methodologically, it integrates Rashomon set theory, semi-synthetic data generation, and empirical model family learning. Experiments on semi-synthetic benchmarks demonstrate that UNIVERSE accurately covers the true VI range. In a real-world credit risk application, it delivers interpretable, actionable insights while substantially enhancing the reliability and scientific rigor of causal inference. This work represents the first principled approach to VI estimation under joint uncertainty from unobserved confounders and model multiplicity.
Abstract
Variable importance (VI) methods are often used for hypothesis generation, feature selection, and scientific validation. In the standard VI pipeline, an analyst estimates VI for a single predictive model using only the observed features. However, the importance of a feature depends heavily on which other variables are included in the model, and essential variables are often omitted from observational datasets. Moreover, the VI estimated for one model often differs from the VI estimated for another equally good model, a phenomenon known as the Rashomon Effect. We address these gaps by introducing UNobservables and Inference for Variable importancE using Rashomon SEts (UNIVERSE). Our approach adapts Rashomon sets (the sets of near-optimal models for a dataset) to produce bounds on the true VI even with missing features. We theoretically guarantee the robustness of our approach, show strong performance on semi-synthetic simulations, and demonstrate its utility in a credit risk task.
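To make the Rashomon Effect concrete, the following is a minimal sketch of the general idea of reporting a VI range over a Rashomon set rather than a single VI number. It is not the UNIVERSE method and does not handle unobserved features. Everything in it is an illustrative assumption: the toy data, the two-coefficient linear model class, the grid of candidate models, the tolerance `epsilon`, and the use of permutation importance as the VI measure.

```python
import random


def mse(model, rows, ys):
    """Mean squared error of the linear model y_hat = a*x1 + b*x2."""
    a, b = model
    return sum((a * x1 + b * x2 - y) ** 2 for (x1, x2), y in zip(rows, ys)) / len(ys)


def permutation_importance(model, rows, ys, feature, perm):
    """Increase in loss when one feature column is shuffled by a fixed permutation."""
    shuffled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        new_row[feature] = rows[perm[i]][feature]
        shuffled.append(tuple(new_row))
    return mse(model, shuffled, ys) - mse(model, rows, ys)


def rashomon_vi_range(rows, ys, candidates, epsilon=0.05, seed=0):
    """Return per-feature (min, max) VI over all candidates within (1 + epsilon) of the best loss."""
    rng = random.Random(seed)
    losses = [mse(m, rows, ys) for m in candidates]
    best = min(losses)
    # Rashomon set: every candidate whose loss is close to the best candidate's loss.
    rashomon_set = [m for m, loss in zip(candidates, losses) if loss <= (1 + epsilon) * best]
    ranges = []
    for feature in range(2):
        # Use the same permutation for every model so the VI values are comparable.
        perm = list(range(len(rows)))
        rng.shuffle(perm)
        vis = [permutation_importance(m, rows, ys, feature, perm) for m in rashomon_set]
        ranges.append((min(vis), max(vis)))
    return ranges, len(rashomon_set)


# Hypothetical toy data: two correlated observed features and a noisy linear outcome.
data_rng = random.Random(1)
rows, ys = [], []
for _ in range(300):
    x1 = data_rng.gauss(0, 1)
    x2 = 0.8 * x1 + 0.6 * data_rng.gauss(0, 1)
    rows.append((x1, x2))
    ys.append(2 * x1 + x2 + data_rng.gauss(0, 1))

# Hypothetical model class: a coarse grid over the two coefficients.
candidates = [(a / 10, b / 10) for a in range(0, 41) for b in range(-10, 31)]

vi_ranges, n_models = rashomon_vi_range(rows, ys, candidates)
for name, (lo, hi) in zip(("x1", "x2"), vi_ranges):
    print(f"{name}: VI in [{lo:.3f}, {hi:.3f}] across {n_models} near-optimal models")
```

Because the two features are correlated, several coefficient pairs fit almost equally well, so each feature receives an interval of importance values rather than a single number. The abstract's contribution goes further by bounding the true VI when relevant features are missing from the dataset, which this sketch does not attempt.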