🤖 AI Summary
This work addresses the poor reliability of existing counterfactual explanation methods in low-density regions and their susceptibility to model multiplicity. The authors propose DensityFlow, a framework that models counterfactual generation as continuous-time dynamics parameterized by a neural ordinary differential equation (Neural ODE). A key component is a differentiable density score, learned via noise-contrastive estimation (NCE) with a (K+1)-way discriminator, which steers the generation trajectory away from uncertain, low-density regions and toward high-confidence areas of the data manifold. In black-box settings, the method distills the target model into a lightweight local surrogate along the generation trajectory, which avoids costly ensembles and reduces the number of queries required. Experiments show that DensityFlow achieves superior validity under model multiplicity while significantly reducing query costs compared to ensemble-based baselines.
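To make the link between a $(K{+}1)$-way discriminator and a density ratio concrete, one standard ranking form of NCE can be written as follows. This is an illustrative formulation, not necessarily the authors' exact objective; the noise distribution $q$, the score network $s_\theta$, and the constant $c$ are notation assumed here. Draw one data sample $x_0 \sim p$ and $K$ noise samples $x_1, \dots, x_K \sim q$, then train the discriminator to pick out the data sample:

\[
\mathcal{L}(\theta) = -\,\mathbb{E}\left[\log \frac{\exp s_\theta(x_0)}{\sum_{j=0}^{K} \exp s_\theta(x_j)}\right]
\]

At the optimum of this objective the score recovers the log density ratio up to an additive constant,

\[
s_{\theta^\ast}(x) = \log \frac{p(x)}{q(x)} + c ,
\]

so $s_\theta$ is differentiable in $x$ and its gradient can serve as a guidance signal toward high-density regions.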
📝 Abstract
Counterfactual explanations (CEs) are essential for actionable recourse, yet their reliability is often compromised in low-density regions, where classifiers exhibit high variance. Unlike existing methods that rely on expensive ensemble intersections to define stability, we propose \textit{DensityFlow}, a generative framework that constructs robust CEs by adhering to the high-confidence data manifold. Specifically, we model counterfactual generation as continuous-time dynamics parameterized by a Neural ODE and guided by a differentiable density score that actively avoids uncertain, low-density areas. This density score is learned via Noise Contrastive Estimation (NCE), using a $(K{+}1)$-way discriminator to estimate density ratios. For black-box settings, we introduce a local proxy distillation mechanism that aligns a lightweight surrogate with the target model strictly along the trajectory of CE generation, enabling efficient gradient-based optimization with minimal queries. Experiments demonstrate that \textit{DensityFlow} achieves superior validity under model multiplicity while significantly reducing query costs compared to ensemble-based baselines. Our implementation is available at https://github.com/G-AILab/DensityFlow.
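The idea of density-guided continuous-time generation can be illustrated with a toy sketch. It is not the released implementation and makes several simplifying assumptions: the vector field is hand-written rather than a learned Neural ODE, the density score is the gradient of a known Gaussian log-density rather than an NCE-trained score, the classifier is a fixed logistic model, and integration uses fixed-step explicit Euler. All names (`MU`, `W`, `B`, `velocity`, `generate_counterfactual`) are hypothetical.

```python
import numpy as np

# Hypothetical stand-ins, chosen only for illustration (not from the paper).
MU = np.array([2.0, 2.0])   # centre of the target-class data density
W = np.array([1.0, 1.0])    # weights of a toy logistic classifier
B = -2.0                    # bias of the toy logistic classifier


def grad_log_density(x):
    """Gradient of log N(x; MU, I); stands in for a learned density score."""
    return MU - x


def classifier_prob(x):
    """Probability of the target class under the toy logistic model."""
    return 1.0 / (1.0 + np.exp(-(W @ x + B)))


def grad_log_prob(x):
    """Gradient of log classifier_prob(x) with respect to x."""
    return (1.0 - classifier_prob(x)) * W


def velocity(x, lam=1.0):
    """Vector field: move toward the target class and toward high density."""
    return grad_log_prob(x) + lam * grad_log_density(x)


def generate_counterfactual(x0, steps=200, dt=0.05, lam=1.0):
    """Integrate dx/dt = velocity(x) with fixed-step explicit Euler."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = x + dt * velocity(x, lam)
    return x


x0 = np.array([-1.0, -1.0])          # factual input, classified as negative
x_cf = generate_counterfactual(x0)   # end point of the guided trajectory
```

The weight `lam` trades off the classifier term against the density term: a larger value keeps the trajectory closer to the high-density region, at the cost of a counterfactual that may lie farther from the original input.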