🤖 AI Summary
This work addresses the lack of causal interpretability in black-box AI models by proposing a causal concept-driven explainable AI framework. The method extracts semantic concepts post hoc, constructs a causal graph linking concepts to model outputs, and quantifies the effect of concept interventions on predictions via the probability of sufficiency, yielding faithful explanations at both local and global levels. Its key innovation is explicitly modeling the causal effects of concept interventions, ensuring explanations are both human-intelligible (high understandability) and consistent with the behavior of the model being explained (high fidelity). Experiments on the CelebA dataset show that the generated concept-based explanations have clear semantics, strong readability, and predictions closely aligned with those of the original model, empirically validating the framework's balance between fidelity and interpretability.
📝 Abstract
This work presents a conceptual framework for causal concept-based post-hoc Explainable Artificial Intelligence (XAI), based on the requirements that explanations for non-interpretable models should be understandable as well as faithful to the model being explained. Local and global explanations are generated by calculating the probability of sufficiency of concept interventions. Example explanations are presented, generated with a proof-of-concept model built to explain classifiers trained on the CelebA dataset. Understandability is demonstrated through a clear concept-based vocabulary, subject to an implicit causal interpretation. Fidelity is addressed by highlighting important framework assumptions, stressing that the context in which an explanation is interpreted must align with the context in which it was generated.
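As a rough illustration of the central quantity, the probability of sufficiency of a concept intervention can be estimated by forcing a concept on in samples where it is absent and the target prediction does not hold, then measuring how often the intervention flips the model's output (Pearl's PS = P(y_x | x', y')). The sketch below is a minimal toy, not the paper's implementation: the classifier, the concept names, and the direct intervention on binary concept vectors are all hypothetical stand-ins (the paper's classifiers operate on CelebA images).

```python
import random

# Hypothetical stand-in for a black-box classifier over binary concept
# vectors; the toy rule predicts 1 only when both concepts are present.
def black_box(sample):
    return int(sample["smiling"] and sample["young"])

def intervene(sample, concept, value):
    """do(concept := value): return a copy of the sample with the concept forced."""
    s = dict(sample)
    s[concept] = value
    return s

def probability_of_sufficiency(model, data, concept, y_target=1):
    """Estimate PS = P(Y_{do(concept:=1)} = y_target | concept = 0, Y != y_target).

    Averaged over samples where the concept is absent and the target output
    is not already predicted, i.e. how often forcing the concept on is
    sufficient to produce the target prediction.
    """
    base = [s for s in data if s[concept] == 0 and model(s) != y_target]
    if not base:
        return None  # conditioning event never observed
    hits = sum(model(intervene(s, concept, 1)) == y_target for s in base)
    return hits / len(base)

random.seed(0)
data = [{"smiling": random.randint(0, 1), "young": random.randint(0, 1)}
        for _ in range(1000)]
ps = probability_of_sufficiency(black_box, data, "smiling")
print(ps)
```

Under the toy rule, forcing "smiling" on succeeds exactly when "young" is already present, so the estimate lands near 0.5; in the framework proper, the interventions would be mediated by the learned causal graph rather than applied directly to input features.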