Doctor Rashomon and the UNIVERSE of Madness: Variable Importance with Unobserved Confounding and the Rashomon Effect

📅 2025-10-14

🤖 AI Summary

Variable importance (VI) analysis is often compromised by unobserved confounding and the Rashomon effect, which make single-model estimates biased and unstable. To address this, we propose UNIVERSE, a framework that extends Rashomon sets to settings with unobserved variables. UNIVERSE builds a set of near-optimal models and uses it to bound the true VI when features are missing, yielding interval estimates rather than a single point estimate. The approach comes with theoretical robustness guarantees. On semi-synthetic benchmarks, the resulting bounds are reported to cover the true VI, and a credit risk application illustrates the method's practical use. The work targets VI estimation under joint uncertainty from unobserved confounders and model multiplicity.

📝 Abstract

Variable importance (VI) methods are often used for hypothesis generation, feature selection, and scientific validation. In the standard VI pipeline, an analyst estimates VI for a single predictive model with only the observed features. However, the importance of a feature depends heavily on which other variables are included in the model, and essential variables are often omitted from observational datasets. Moreover, the VI estimated for one model is often not the same as the VI estimated for another equally good model, a phenomenon known as the Rashomon Effect. We address these gaps by introducing UNobservables and Inference for Variable importancE using Rashomon SEts (UNIVERSE). Our approach adapts Rashomon sets (the sets of near-optimal models for a dataset) to produce bounds on the true VI even with missing features. We theoretically guarantee the robustness of our approach, show strong performance on semi-synthetic simulations, and demonstrate its utility in a credit risk task.
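
To make the Rashomon effect concrete, here is a minimal sketch of how VI can vary across near-optimal models. It is not the UNIVERSE procedure and does not handle unobserved features. It only illustrates the underlying idea of reporting a VI interval over a Rashomon set instead of a single number. Everything in it is an assumption made for illustration: the toy data, the grid of two-feature linear models, the tolerance `eps`, the helper names, and the use of a fixed cyclic shift as a deterministic stand-in for a random permutation.

```python
import random

def mse(w, X, y):
    """Mean squared error of the linear model w[0]*x1 + w[1]*x2."""
    return sum((w[0] * a + w[1] * b - t) ** 2 for (a, b), t in zip(X, y)) / len(y)

def shifted_loss(w, X, y, j):
    """Loss after breaking the link between feature j and y.

    A cyclic shift by half the sample is used as a deterministic
    stand-in for a random permutation (illustrative choice).
    """
    n = len(X)
    col = [row[j] for row in X]
    shifted = col[n // 2:] + col[:n // 2]
    Xs = [(s, row[1]) if j == 0 else (row[0], s) for row, s in zip(X, shifted)]
    return mse(w, Xs, y)

def rashomon_vi_bounds(X, y, grid, eps=0.05):
    """Return the Rashomon set and per-feature (min, max) VI over it.

    Rashomon set: all candidate models whose loss is within a factor
    (1 + eps) of the best loss found on the grid.
    VI of feature j for a model: increase in loss when feature j is shifted.
    """
    losses = {w: mse(w, X, y) for w in grid}
    best = min(losses.values())
    rset = [w for w, loss in losses.items() if loss <= (1 + eps) * best]
    bounds = []
    for j in range(2):
        vis = [shifted_loss(w, X, y, j) - losses[w] for w in rset]
        bounds.append((min(vis), max(vis)))
    return rset, bounds

# Toy data: two strongly correlated features, so many different weight
# splits between them predict almost equally well.
random.seed(0)
X, y = [], []
for _ in range(200):
    x1 = random.gauss(0, 1)
    x2 = x1 + random.gauss(0, 0.1)
    X.append((x1, x2))
    y.append(x1 + x2 + random.gauss(0, 0.5))

# Candidate models: weights on a coarse grid from -1 to 3 in steps of 0.25.
grid = [(a / 4, b / 4) for a in range(-4, 13) for b in range(-4, 13)]

rset, bounds = rashomon_vi_bounds(X, y, grid)
print("models in Rashomon set:", len(rset))
for j, (lo, hi) in enumerate(bounds):
    print(f"feature {j}: VI in [{lo:.3f}, {hi:.3f}]")
```

Because the two features are nearly interchangeable in this toy setup, models that put most of the weight on one feature can fit about as well as models that put it on the other, so the per-feature VI interval is typically much wider than any single model's estimate would suggest.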

Problem

Research questions and friction points this paper is trying to address.

Addressing variable importance bias from unobserved confounders

Quantifying feature importance uncertainty due to Rashomon effect

Providing robust variable importance bounds with missing features

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts Rashomon sets for variable importance bounds

Handles unobserved confounding in observational datasets

Provides theoretical guarantees for robustness
