Conditional Feature Importance with Generative Modeling Using Adversarial Random Forests

πŸ“… 2025-01-19
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Existing perturbation-based methods for estimating conditional feature importance on tabular data with mixed (categorical/continuous) features often introduce bias by violating the underlying data manifold. To address this, the authors propose cARFi, an XAI method that uses a lightweight adversarial random forest (ARF) to model conditional feature distributions and generate manifold-preserving samples, enabling robust conditional importance estimation. cARFi unifies conditional and marginal notions of feature importance, requires only little tuning, natively supports conditioning on feature subsets, and allows the significance of importance scores to be assessed via statistical tests.

πŸ“ Abstract
This paper proposes a method for measuring conditional feature importance via generative modeling. In explainable artificial intelligence (XAI), conditional feature importance assesses the impact of a feature on a prediction model's performance given the information of other features. Model-agnostic post hoc methods to do so typically evaluate changes in predictive performance under on-manifold feature value manipulations. Such procedures require creating feature values that respect conditional feature distributions, which can be challenging in practice; recent advances in generative modeling can facilitate this. For tabular data, which may consist of both categorical and continuous features, the adversarial random forest (ARF) stands out as a generative model that can generate on-manifold data points without requiring intensive tuning efforts or computational resources, making it a promising candidate for subroutines in XAI methods. This paper proposes cARFi (conditional ARF feature importance), a method for measuring conditional feature importance through feature values sampled from ARF-estimated conditional distributions. cARFi requires only little tuning to yield robust importance scores that flexibly adapt to conditional or marginal notions of feature importance, includes straightforward extensions to condition on feature subsets, and allows the significance of feature importance scores to be inferred through statistical tests.
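The core idea in the abstract — resample a feature from its conditional distribution given the other features and measure the resulting loss change — can be sketched as follows. Note this is an illustrative sketch only: the paper's ARF-based conditional sampler is replaced here by a simple nearest-neighbour resampler, and the data, model, and function names are invented for the example.

```python
# Illustrative sketch of conditional feature importance via conditional
# resampling. A k-nearest-neighbour resampler stands in for the paper's
# adversarial random forest (ARF); it is NOT the cARFi implementation.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Synthetic data: y depends on x0 and x1; x2 is nearly a copy of x0,
# so x0 carries little information *conditional on* x2.
n = 500
x0 = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = x0 + 0.1 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = x0 + x1 + 0.1 * rng.normal(size=n)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
base_loss = mean_squared_error(y, model.predict(X))

def conditional_importance(model, X, y, j, n_rounds=10):
    """Average loss increase when feature j is resampled from an
    approximate conditional distribution given the remaining features."""
    others = np.delete(np.arange(X.shape[1]), j)
    # Stand-in conditional sampler: draw x_j from a random nearest
    # neighbour in the space of the other features (cARFi would instead
    # sample from the ARF-estimated conditional distribution).
    nn = NearestNeighbors(n_neighbors=10).fit(X[:, others])
    _, idx = nn.kneighbors(X[:, others])
    losses = []
    for _ in range(n_rounds):
        pick = idx[np.arange(len(X)),
                   rng.integers(0, idx.shape[1], len(X))]
        X_tilde = X.copy()
        X_tilde[:, j] = X[pick, j]
        losses.append(mean_squared_error(y, model.predict(X_tilde)))
    return float(np.mean(losses) - base_loss)

scores = [conditional_importance(model, X, y, j) for j in range(3)]
```

On this data, x1 receives a high conditional importance (it is independent of the other features, so conditional and marginal importance coincide), while x0 and x2 receive low scores because each is nearly redundant given the other. The paper additionally tests significance of such scores via statistical tests over repeated resampling rounds, which the sketch omits.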
Problem

Research questions and friction points this paper is trying to address.

AI Model Interpretability
Feature Importance Assessment
Tabular Data Handling
Innovation

Methods, ideas, or system contributions that make the work stand out.

cARFi
Adversarial Random Forests
Feature Importance Assessment
πŸ”Ž Similar Papers
No similar papers found.
Kristin Blesch
Leibniz Institute for Prevention Research & Epidemiology – BIPS, Germany; Faculty of Mathematics and Computer Science, University of Bremen, Germany

Niklas Koenen
Leibniz Institute for Prevention Research and Epidemiology – BIPS

Jan Kapar
PhD Student, University of Bremen
Generative modeling, Tabular data, Machine learning, Explainable artificial intelligence

Pegah Golchian
PhD Student in Machine Learning
Machine learning, Explainable AI, Interpretable Machine Learning, Missing Data, Generative AI

Lukas Burk
Leibniz Institute for Prevention Research & Epidemiology – BIPS, Germany; Faculty of Mathematics and Computer Science, University of Bremen, Germany

Markus Loecher
Department of Business and Economics, Berlin School of Economics and Law, Germany

Marvin N. Wright
Leibniz Institute for Prevention Research and Epidemiology – BIPS & University of Bremen
Interpretable machine learning, Biostatistics