🤖 AI Summary
This paper addresses the generation of counterfactual explanations for tabular data with categorical features, as is typical in finance and the social sciences. We propose a guided reverse diffusion process for categorical features, based on an approximation to the Gumbel-softmax distribution, which provides a continuous relaxation of categorical variables. We further study the effect of the temperature parameter τ and derive a theoretical bound between the Gumbel-softmax distribution and the proposed approximation. The method is evaluated on several large-scale credit lending and other tabular datasets using four quantitative measures: interpretability, diversity, instability, and validity. The results indicate that it outperforms popular baseline methods and produces robust, realistic counterfactual explanations.
📝 Abstract
Counterfactual explanation methods are an important tool in interpretable machine learning. Recent advances in this direction have used diffusion models to explain deep classifiers, but these techniques have predominantly targeted problems in computer vision. In this paper, we focus on tabular data typical in finance and the social sciences and propose a novel guided reverse process for categorical features based on an approximation to the Gumbel-softmax distribution. Furthermore, we study the effect of the temperature $τ$ and derive a theoretical bound between the Gumbel-softmax distribution and our proposed approximated distribution. We perform experiments on several large-scale credit lending and other tabular datasets, assessing performance in terms of the quantitative measures of interpretability, diversity, instability, and validity. The results indicate that our approach outperforms popular baseline methods, producing robust and realistic counterfactual explanations.
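To make the role of the temperature concrete, the following is a minimal sketch of sampling from the standard Gumbel-softmax distribution with NumPy. It is not the approximated distribution or the guided reverse process proposed in the paper, whose details are not given in the abstract. The function names, the example category probabilities, and the temperature values are assumptions chosen for illustration.

```python
import numpy as np


def gumbel_softmax_sample(logits, tau, rng):
    """Draw one relaxed one-hot sample from the standard Gumbel-softmax.

    Illustrative helper (hypothetical name), not the paper's method.
    """
    # Gumbel(0, 1) noise by inverse transform sampling of uniform draws.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    # Perturb the logits, scale by the temperature, then apply a
    # numerically stable softmax over the last axis.
    z = (logits + gumbel_noise) / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mean_max_prob(logits, tau, n_samples=1000, seed=0):
    """Average largest coordinate over many samples.

    Values near 1 mean samples are close to one-hot vectors;
    values near 1/K mean samples are close to uniform.
    """
    rng = np.random.default_rng(seed)
    samples = np.stack(
        [gumbel_softmax_sample(logits, tau, rng) for _ in range(n_samples)]
    )
    return float(samples.max(axis=-1).mean())


# Assumed example: one categorical feature with three categories.
example_logits = np.log(np.array([0.6, 0.3, 0.1]))

for tau in (0.1, 1.0, 10.0):
    print(f"tau={tau}: mean max probability = {mean_max_prob(example_logits, tau):.3f}")
```

In this sketch, a low temperature yields samples close to one-hot vectors, while a high temperature pushes samples toward the uniform vector. This is the trade-off that makes the choice of $τ$ relevant when a continuous relaxation stands in for a categorical feature.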