V-CECE: Visual Counterfactual Explanations via Conceptual Edits

📅 2025-09-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing black-box counterfactual generation methods neglect the semantic plausibility of edits and rely heavily on model-specific training. Method: We propose a training-free, plug-and-play visual counterfactual generation framework that performs stepwise editing grounded in interpretable semantic concepts, enabling black-box explainability without access to the classifier's internals. Our approach leverages pre-trained image-editing diffusion models and introduces, for the first time, a theoretically guaranteed optimal editing strategy. Contribution/Results: The framework achieves near-human-level counterfactual quality without any fine-tuning. Extensive experiments validate its effectiveness across diverse architectures, including CNNs, Vision Transformers (ViTs), and large vision-language models, while human evaluations confirm the naturalness and semantic reasonableness of the generated explanations. By grounding counterfactual edits in human-interpretable concepts and ensuring theoretical optimality, our method significantly narrows the semantic gap between neural network decisions and human reasoning.
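To make the stepwise procedure concrete, below is a minimal sketch of one plausible greedy instantiation of such a black-box editing loop. This is an illustrative assumption, not the paper's actual algorithm: the names `CONCEPT_EDITS`, `edit_image`, and `classifier_prob` are hypothetical placeholders, and the paper's theoretically optimal edit-selection strategy is not reproduced here.

```python
# Minimal sketch (assumed, not the paper's code) of a greedy stepwise
# black-box counterfactual loop over human-interpretable concept edits.
from typing import Callable, List, Tuple

# Hypothetical vocabulary of concept-level edit instructions for a
# pre-trained image-editing diffusion model (instruction-style prompts).
CONCEPT_EDITS: List[str] = [
    "add stripes to the animal",
    "make the fur color orange",
    "make the ears pointed",
]

def generate_counterfactual(
    image,
    target_class: str,
    edit_image: Callable,       # diffusion editor: (image, instruction) -> edited image
    classifier_prob: Callable,  # black-box classifier: (image, class) -> probability
    max_steps: int = 5,
    flip_threshold: float = 0.5,
) -> Tuple[object, List[str]]:
    """Greedily apply whichever concept edit most increases the target-class
    probability, stopping once the prediction flips. The classifier is used
    strictly as a black box: only its output probabilities are queried."""
    current, applied = image, []
    for _ in range(max_steps):
        best_prob = classifier_prob(current, target_class)
        best_edit, best_image = None, None
        for edit in CONCEPT_EDITS:
            candidate = edit_image(current, edit)
            prob = classifier_prob(candidate, target_class)
            if prob > best_prob:
                best_edit, best_image, best_prob = edit, candidate, prob
        if best_edit is None:            # no single edit helps; stop early
            break
        current = best_image
        applied.append(best_edit)        # record the interpretable edit trail
        if best_prob >= flip_threshold:  # decision flipped: counterfactual found
            break
    return current, applied
```

The key property this sketch preserves is that the classifier is queried only through its output probabilities, so the same loop applies unchanged to CNN, ViT, or LVLM-based classifiers.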

📝 Abstract
Recent black-box counterfactual generation frameworks fail to take into account the semantic content of the proposed edits, while relying heavily on training to guide the generation process. We propose a novel, plug-and-play black-box counterfactual generation framework, which suggests step-by-step edits based on theoretical guarantees of optimal edits to produce human-level counterfactual explanations with zero training. Our framework utilizes a pre-trained image editing diffusion model, and operates without access to the internals of the classifier, leading to an explainable counterfactual generation process. Throughout our experimentation, we showcase the explanatory gap between human reasoning and neural model behavior by utilizing Convolutional Neural Network (CNN), Vision Transformer (ViT), and Large Vision Language Model (LVLM) classifiers, substantiated through a comprehensive human evaluation.
Problem

Research questions and friction points this paper is trying to address.

How to generate counterfactual explanations without any training requirements
How to explain a classifier as a black box, without access to its internals
How to bridge the explanatory gap between human reasoning and model behavior
Innovation

Methods, ideas, or system contributions that make the work stand out.

Plug-and-play black-box counterfactual generation framework
Uses a pre-trained image-editing diffusion model
Provides step-by-step edits with zero training