S-CFE: Simple Counterfactual Explanations

📅 2024-10-21
🏛️ arXiv.org
📈 Citations: 0
Influential citations: 0
📄 PDF
🤖 AI Summary
This work addresses classifier interpretability by generating sparse, manifold-aligned, and semantically plausible counterfactual explanations. Existing methods face key bottlenecks: non-convex objective functions, poorly controllable regularization, and weak generalization across diverse classifiers. To overcome these, we propose the first framework integrating Accelerated Proximal Gradient (APG) into non-convex counterfactual optimization. Our approach supports non-smooth ℓₚ sparsity regularization for 0 ≤ p < 1, jointly incorporates differentiable manifold regularization, and enforces box constraints to preserve feature feasibility. The unified formulation is compatible with various classifiers and plausibility metrics. Experiments on real-world datasets demonstrate that our method efficiently generates high-quality counterfactuals—achieving greater sparsity, closer proximity to the original instance, strict adherence to feature bounds, and improved alignment with the underlying data manifold—outperforming prior approaches in both fidelity and interpretability.

📝 Abstract
We study the problem of finding optimal sparse, manifold-aligned counterfactual explanations for classifiers. Canonically, this can be formulated as an optimization problem with multiple non-convex components, including classifier loss functions and manifold-alignment (or plausibility) metrics. The added complexity of enforcing sparsity, i.e., shorter explanations, complicates the problem further. Existing methods often focus on specific models and plausibility measures, relying on convex ℓ₁ regularizers to enforce sparsity. In this paper, we tackle the canonical formulation using the accelerated proximal gradient (APG) method, a simple yet efficient first-order procedure capable of handling smooth non-convex objectives and non-smooth ℓₚ (where 0 ≤ p < 1) regularizers. This enables our approach to seamlessly incorporate various classifiers and plausibility measures while producing sparser solutions. Our algorithm only requires differentiable data-manifold regularizers and supports box constraints for bounded feature ranges, ensuring the generated counterfactuals remain actionable. Finally, experiments on real-world datasets demonstrate that our approach effectively produces sparse, manifold-aligned counterfactual explanations while maintaining proximity to the factual data and computational efficiency.
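As a rough sketch of the kind of procedure the abstract describes (not the paper's actual algorithm), an accelerated proximal-gradient loop for a sparse, box-constrained counterfactual might look like the following. The hard-thresholding step corresponds to the p = 0 endpoint of the ℓₚ range, `grad_loss` is a placeholder for the gradient of the classifier loss plus a differentiable manifold regularizer, and all names here are illustrative assumptions:

```python
import numpy as np

def hard_threshold(v, tau):
    # Proximal operator of tau * ||v||_0 (the p = 0 case of the l_p family):
    # keep entries whose squared magnitude exceeds 2*tau, zero out the rest.
    out = v.copy()
    out[v**2 <= 2.0 * tau] = 0.0
    return out

def apg_counterfactual(x0, grad_loss, lam=0.1, step=0.05,
                       lo=0.0, hi=1.0, iters=200):
    """APG sketch for a sparse, box-constrained counterfactual.

    x0        -- factual instance (1-D array); sparsity is enforced on x - x0
    grad_loss -- gradient of the smooth part (classifier loss + manifold term)
    lam       -- sparsity weight; step -- fixed step size
    lo, hi    -- box constraints keeping features in a feasible range
    """
    x = x0.copy()
    y = x0.copy()          # Nesterov extrapolation point
    t = 1.0                # momentum scalar
    for _ in range(iters):
        x_new = y - step * grad_loss(y)                  # gradient step
        delta = hard_threshold(x_new - x0, step * lam)   # sparsify the change
        x_new = np.clip(x0 + delta, lo, hi)              # project onto the box
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)    # momentum update
        x, t = x_new, t_new
    return x
```

Note that applying the thresholding and the box projection sequentially is only an approximation of the exact proximal map of the combined nonsmooth term; a faithful implementation would follow the paper's own operator. For p strictly between 0 and 1 (e.g. the known closed-form ℓ₁/₂ half-thresholding), `hard_threshold` would be replaced accordingly.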
Problem

Research questions and friction points this paper is trying to address.

Interpretable machine learning
Complex objective functions
Adaptive regularization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Accelerated Proximal Gradient (APG)
Non-convex regularization
Interpretable classifiers
Shpresim Sadiku
Technische Universität Berlin, Institute of Mathematics; Zuse Institute Berlin, Department AIS2T
Moritz Wagner
Technische Universität Berlin, Institute of Mathematics; Zuse Institute Berlin, Department AIS2T
Sai Ganesh Nagarajan
Postdoctoral Researcher, Zuse Institute Berlin
Deep Learning, Learning in Games, Dynamical Systems
Sebastian Pokutta
Technische Universität Berlin, Institute of Mathematics; Zuse Institute Berlin, Department AIS2T