TraNCE: Transformative Nonlinear Concept Explainer for CNNs

📅 2025-03-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing linear concept-based CNN explanation methods fail to capture nonlinear relationships among activations, and their Fidelity-only evaluation of global explanations lacks verifiability. To address these limitations, we propose the first interpretability framework explicitly designed for nonlinear concept modeling. Our approach comprises three key components: (1) unsupervised, automatic concept discovery via variational autoencoders (VAEs); (2) a bipolar visualization module guided by Bessel functions that jointly highlights the semantic regions the model attends to and those it suppresses; and (3) a novel Faith score integrating concept Coherence and Fidelity. Extensive evaluation across multiple CNN architectures and benchmark datasets demonstrates that our method significantly improves concept discovery accuracy and human interpretability while effectively mitigating concept redundancy; quantitatively, the Faith score increases by an average of 12.7% over baseline methods.
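The Faith score's exact formula is not reproduced on this page, so the following is only a minimal NumPy sketch of how Coherence and Fidelity might be combined into one faithfulness measure. The top-1-agreement proxy for Fidelity, the mean-pairwise-cosine proxy for Coherence, and the multiplicative aggregation in `faith_score` are all illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def fidelity(model_logits, reconstructed_logits):
    """Top-1 agreement between predictions on the original activations
    and on the concept-reconstructed activations (assumed proxy)."""
    return float(np.mean(
        np.argmax(model_logits, axis=1) == np.argmax(reconstructed_logits, axis=1)
    ))

def coherence(concept_scores):
    """Consistency of concept activations across examples, approximated
    here as the mean pairwise cosine similarity (assumed proxy)."""
    X = concept_scores / np.linalg.norm(concept_scores, axis=1, keepdims=True)
    sim = X @ X.T
    n = len(X)
    return float((sim.sum() - n) / (n * (n - 1)))  # mean of off-diagonal entries

def faith_score(model_logits, reconstructed_logits, concept_scores):
    """Hypothetical aggregation: the paper integrates Coherence and
    Fidelity; a simple product is used here purely for illustration."""
    return coherence(concept_scores) * fidelity(model_logits, reconstructed_logits)
```

A multiplicative combination is just one plausible choice; it has the convenient property that an explainer scores high only when it is both coherent and faithful, which matches the paper's stated motivation for going beyond Fidelity alone.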

📝 Abstract
Convolutional neural networks (CNNs) have succeeded remarkably in various computer vision tasks, but they are not intrinsically explainable. While the feature-level understanding of CNNs reveals where the models looked, concept-based explainability methods provide insights into what the models saw. However, their assumption of linear reconstructability of image activations fails to capture the intricate relationships within these activations, and their Fidelity-only approach to evaluating global explanations raises a further concern. For the first time, we address these limitations with the novel Transformative Nonlinear Concept Explainer (TraNCE) for CNNs. Unlike existing methods that assume linear reconstruction, TraNCE captures the intricate relationships within the activations. This study presents three original contributions to the CNN explainability literature: (i) an automatic concept discovery mechanism based on variational autoencoders (VAEs), whose transformative discovery process enhances the identification of meaningful concepts from image activations; (ii) a visualization module that leverages the Bessel function to create a smooth transition between prototypical image pixels, revealing not only what the CNN saw but also what it avoided, thereby mitigating the concept duplication documented in previous works; and (iii) a new metric, the Faith score, that integrates both Coherence and Fidelity for a comprehensive evaluation of explainer faithfulness and consistency.
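To make the VAE-based concept discovery concrete, here is a minimal PyTorch sketch of a VAE over flattened CNN activations, where each latent dimension plays the role of a candidate nonlinear concept. The class name `ConceptVAE`, the layer widths, and the beta-weighted loss are assumptions for illustration; the paper's actual architecture is not described on this page.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptVAE(nn.Module):
    """Minimal VAE over flattened CNN activations; each latent dimension
    is treated as a candidate nonlinear concept (illustrative sizes)."""
    def __init__(self, act_dim=512, n_concepts=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(act_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, n_concepts)
        self.logvar = nn.Linear(128, n_concepts)
        self.dec = nn.Sequential(nn.Linear(n_concepts, 128), nn.ReLU(),
                                 nn.Linear(128, act_dim))

    def forward(self, a):
        h = self.enc(a)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation trick
        return self.dec(z), mu, logvar

def vae_loss(a, a_hat, mu, logvar, beta=1.0):
    """Activation reconstruction error plus KL regulariser; the beta
    weighting is an assumption, not taken from the paper."""
    recon = F.mse_loss(a_hat, a)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```

Training such a model on activations collected from a target layer, then inspecting the images that most strongly drive each latent unit, is a common recipe for unsupervised concept discovery; the nonlinear decoder is what lifts the linear-reconstructability assumption criticized in the abstract.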
Problem

Research questions and friction points this paper is trying to address.

Nonlinear reconstruction of CNN activations for better explainability
Evaluating global explanations beyond Fidelity-only approaches
Mitigating concept duplication in CNN interpretability methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

VAE-based automatic concept discovery
Bessel function for smooth, bipolar visualization (sketched after this list)
Faith score integrates Coherence and Fidelity
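As a rough illustration of the Bessel-guided, bipolar visualization idea, the sketch below convolves a signed concept map (positive for attended regions, negative for avoided ones) with a radial kernel built from `scipy.special.j0`, the Bessel function of the first kind of order zero. The kernel shape, the `scale` parameter, and the FFT-based convolution are assumptions for illustration, not the paper's actual module.

```python
import numpy as np
from scipy.special import j0  # Bessel function of the first kind, order zero

def bipolar_bessel_map(concept_map, scale=6.0):
    """Smooth a signed concept map (attended regions > 0, avoided < 0)
    with a Bessel-weighted radial kernel so the two polarities blend
    without hard edges. Kernel design and scale are assumptions."""
    h, w = concept_map.shape
    yy, xx = np.mgrid[-1:1:h * 1j, -1:1:w * 1j]   # normalised pixel grid
    r = np.sqrt(xx ** 2 + yy ** 2)
    kernel = j0(scale * r)                         # oscillatory, decaying weight
    kernel /= np.abs(kernel).sum()                 # keep the output bounded
    # Circular convolution via FFT; ifftshift moves the kernel centre to the origin.
    smoothed = np.real(np.fft.ifft2(
        np.fft.fft2(concept_map) * np.fft.fft2(np.fft.ifftshift(kernel))
    ))
    return np.clip(smoothed, -1.0, 1.0)

# Example: a toy 64x64 map with one attended and one avoided blob.
demo = np.zeros((64, 64))
demo[16:24, 16:24], demo[40:48, 40:48] = 1.0, -1.0
print(bipolar_bessel_map(demo).shape)  # (64, 64)
```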
Ugochukwu Ejike Akpudo
Griffith University
Computer Vision · Explainable AI · Data Governance · Data Science/Analytics · Mech/Elect Engineering
Yongsheng Gao
Integrated and Intelligent Systems, Griffith University, Australia
Jun Zhou
Integrated and Intelligent Systems, Griffith University, Australia
Andrew Lewis
Integrated and Intelligent Systems, Griffith University, Australia