🤖 AI Summary
Problem: Counterfactual explanation methods designed for continuous inputs transfer poorly to graph-structured data (e.g., molecular graphs, social networks), because graphs are discrete, non-Euclidean objects that violate the assumptions behind continuous-space optimization.
Method: We propose Graph Diffusion Counterfactual Explanation, a counterfactual generation framework for graphs that combines discrete diffusion modeling with classifier-free guidance. Unlike prior work, the method operates directly in the discrete graph structure space: it models node and edge perturbations without continuous relaxation and identifies the minimal topological edits required to flip a model's prediction while preserving structural fidelity and semantic plausibility.
Contribution/Results: The approach supports both discrete classification targets and continuous property prediction. Experiments show improvements over state-of-the-art baselines in explanation faithfulness, structural consistency, and counterfactual quality. By producing interpretable, minimal, and in-distribution interventions, the framework supports more trustworthy use of graph neural networks.
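To make the guidance step in the summary concrete, the following is a minimal sketch of classifier-free guidance applied to categorical edge distributions. It is not the authors' implementation: the function names, the array shapes, and the convention that a guidance scale of 1 recovers the purely conditional model are all assumptions made for illustration. The logits would normally come from a trained denoising network, which is not shown here.

```python
import numpy as np

def log_softmax(logits, axis=-1):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def guided_edge_probs(cond_logits, uncond_logits, guidance_scale):
    """Blend conditional and unconditional predictions in log space.

    Hypothetical helper. Assumed inputs: arrays of shape
    (num_edges, num_edge_states) holding per-edge class logits from a
    denoiser run with and without the target label.
    Assumed convention: scale 0 -> unconditional, 1 -> conditional,
    >1 -> pushed further toward the conditioning signal.
    """
    log_p_cond = log_softmax(cond_logits)
    log_p_uncond = log_softmax(uncond_logits)
    guided = log_p_uncond + guidance_scale * (log_p_cond - log_p_uncond)
    # Renormalize so each row is a valid categorical distribution again.
    return np.exp(log_softmax(guided))

# Toy example: one edge with two states ("absent", "present").
cond = np.array([[0.0, 1.0]])    # conditional model favors "present"
uncond = np.array([[0.0, 0.0]])  # unconditional model is indifferent
for scale in (0.0, 1.0, 3.0):
    print(scale, guided_edge_probs(cond, uncond, scale))
```

Because the blend happens on log-probabilities and is renormalized afterwards, the result stays a categorical distribution over discrete edge states, which is the property that lets guidance be used without a continuous relaxation of the graph.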
📝 Abstract
Machine learning models that operate on graph-structured data, such as molecular graphs or social networks, often make accurate predictions but offer little insight into why those predictions are made. Counterfactual explanations address this challenge by seeking the closest alternative scenario in which the model's prediction would change. Although counterfactual explanations have been studied extensively for tabular data and in computer vision, the graph domain remains comparatively underexplored. Constructing graph counterfactuals is intrinsically difficult because graphs are discrete and non-Euclidean objects. We introduce Graph Diffusion Counterfactual Explanation, a novel framework for generating counterfactual explanations on graph data that combines discrete diffusion models with classifier-free guidance. We empirically demonstrate that our method reliably generates counterfactuals that are both in-distribution and minimally structurally different from the input graph, for discrete classification targets as well as continuous properties.
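The two requirements stated in the abstract, that the prediction changes and that the counterfactual stays structurally close to the input, can be illustrated with a small check. This is only a sketch under simplifying assumptions: graphs are unweighted and undirected, both graphs share the same node set and node ordering, and node attributes are ignored. The names `edge_edit_distance`, `is_valid_counterfactual`, and `toy_predict` are hypothetical, and the distance shown is not necessarily the metric used in the paper.

```python
import numpy as np

def edge_edit_distance(adj_a, adj_b):
    """Count undirected edges present in exactly one of the two graphs.

    Assumes symmetric 0/1 adjacency matrices with aligned node ordering.
    """
    differing = np.triu(adj_a != adj_b, k=1)  # upper triangle: each edge once
    return int(differing.sum())

def is_valid_counterfactual(predict, adj_orig, adj_cf, target_label):
    """True if the counterfactual reaches the target label and the original does not."""
    return predict(adj_cf) == target_label and predict(adj_orig) != target_label

def toy_predict(adj):
    """Stand-in classifier (assumption): label 1 if the graph has at least 3 edges."""
    return int(np.triu(adj, k=1).sum() >= 3)

# Original: path graph 0-1-2. Counterfactual: add edge 0-2 to close the triangle.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
triangle = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]])

print(edge_edit_distance(path, triangle))                        # 1
print(is_valid_counterfactual(toy_predict, path, triangle, 1))   # True
```

Among all graphs that pass the validity check, a counterfactual method in this setting would prefer those with the smallest edit distance, subject to the graph remaining plausible under the data distribution.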