Toward Simple and Robust Contrastive Explanations for Image Classification by Leveraging Instance Similarity and Concept Relevance

📅 2025-06-30

📈 Citations: 0

✨ Influential: 0

career value

183K/year

🤖 AI Summary

This work addresses the explainability and robustness of contrastive explanations in image classification—specifically, why a model prefers one class over others. We propose a novel contrastive explanation method grounded in concept-class relevance and instance embedding similarity. By fine-tuning the model, we extract human-interpretable concepts and their quantitative relevance scores to classes, enabling generation of concise, stable, and semantically meaningful contrastive explanations. Key findings show that high-relevance concepts substantially reduce explanation complexity and improve cross-sample consistency; moreover, explanations driven by high-relevance concepts exhibit superior robustness under image perturbations such as rotation and noise. To our knowledge, this is the first work to systematically establish quantitative relationships between concept relevance and both explanation complexity and robustness, thereby providing a verifiable paradigm for generating trustworthy AI explanations. (149 words)

Technology Category

Application Category

📝 Abstract

Understanding why a classification model prefers one class over another for an input instance is the challenge of contrastive explanation. This work implements concept-based contrastive explanations for image classification by leveraging the similarity of instance embeddings and relevance of human-understandable concepts used by a fine-tuned deep learning model. Our approach extracts concepts with their relevance score, computes contrasts for similar instances, and evaluates the resulting contrastive explanations based on explanation complexity. Robustness is tested for different image augmentations. Two research questions are addressed: (1) whether explanation complexity varies across different relevance ranges, and (2) whether explanation complexity remains consistent under image augmentations such as rotation and noise. The results confirm that for our experiments higher concept relevance leads to shorter, less complex explanations, while lower relevance results in longer, more diffuse explanations. Additionally, explanations show varying degrees of robustness. The discussion of these findings offers insights into the potential of building more interpretable and robust AI systems.

Problem

Research questions and friction points this paper is trying to address.

Develop concept-based contrastive explanations for image classification

Assess explanation complexity across different concept relevance ranges

Test explanation robustness under image augmentations like rotation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages instance similarity for contrastive explanations

Uses concept relevance scores from fine-tuned model

Tests explanation robustness under image augmentations

🔎 Similar Papers

Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach