Explaining Domain Shifts in Language: Concept erasing for Interpretable Image Classification

📅 2025-03-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the degradation of cross-domain generalization that domain-specific concepts cause in concept-based models, this paper proposes Language-guided Concept-Erasing (LanCE). LanCE uses vision-language models to characterize domain shifts through domain descriptors and prompts large language models to generate descriptors for unseen domains. It introduces a Domain Descriptor Orthogonality (DDO) regularizer that enforces orthogonality between domain descriptors and concept representations. The regularizer is plug-and-play: it suppresses interference from domain-specific concepts without modifying the architecture of the underlying concept model. Evaluation on four standard and three newly introduced benchmarks shows improved out-of-distribution (OOD) generalization over previous state-of-the-art concept-based models. The source code is publicly available.

📝 Abstract

Concept-based models map black-box representations to human-understandable concepts, which makes the decision-making process more transparent and lets users understand the reasons behind predictions. However, domain-specific concepts often influence the final predictions, which undermines the model's generalization capabilities and prevents it from being used in high-stakes applications. In this paper, we propose a novel Language-guided Concept-Erasing (LanCE) framework. In particular, we empirically demonstrate that pre-trained vision-language models (VLMs) can approximate distinct visual domain shifts via domain descriptors, while prompting large language models (LLMs) can easily simulate a wide range of descriptors of unseen visual domains. Then, we introduce a novel plug-in domain descriptor orthogonality (DDO) regularizer to mitigate the impact of these domain-specific concepts on the final predictions. Notably, the DDO regularizer is agnostic to the design of concept-based models, and we integrate it into several prevailing models. Through evaluation of domain generalization on four standard benchmarks and three newly introduced benchmarks, we demonstrate that DDO can significantly improve out-of-distribution (OOD) generalization over previous state-of-the-art concept-based models. Our code is available at https://github.com/joeyz0z/LanCE.

Problem

Research questions and friction points this paper is trying to address.

Reducing domain-specific concept impact on predictions

Improving model generalization across unseen domains

Enhancing interpretability in concept-based image classification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Language-guided Concept-Erasing (LanCE) framework

Domain descriptor orthogonality (DDO) regularizer

Approximating visual domain shifts with pre-trained vision-language models (VLMs) via domain descriptors
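
The idea behind an orthogonality regularizer of this kind can be illustrated with a small sketch. This is not the authors' formulation, which this page does not give. It assumes a linear classifier over concept scores, assumes that each class's effective direction in embedding space is the weighted sum of concept embeddings, and assumes the penalty is the mean squared cosine similarity between those class directions and the domain-descriptor embeddings. The function names (`class_directions`, `ddo_penalty`) and the toy 3-dimensional embeddings are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot(u, v) / (norm_u * norm_v)


def class_directions(weights, concepts):
    """Project classifier weights into embedding space.

    weights:  n_classes x n_concepts (linear layer over concept scores)
    concepts: n_concepts x dim       (concept text embeddings)
    returns:  n_classes x dim
    """
    dim = len(concepts[0])
    return [
        [sum(w * c[j] for w, c in zip(row, concepts)) for j in range(dim)]
        for row in weights
    ]


def ddo_penalty(weights, concepts, domain_descriptors):
    """Assumed penalty: mean squared cosine similarity between each class
    direction and each domain-descriptor embedding. It is zero when every
    class direction is orthogonal to every descriptor."""
    directions = class_directions(weights, concepts)
    sims = [cosine(g, d) ** 2 for g in directions for d in domain_descriptors]
    return sum(sims) / len(sims)


# Toy setup: three concepts on the axes of a 3-d embedding space.
# The third axis stands in for a domain-specific direction (e.g. "sketch").
concepts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
domain_descriptors = [[0.0, 0.0, 1.0]]

leaky_weights = [[1.0, 0.0, 1.0]]  # class relies on the domain-specific concept
clean_weights = [[1.0, 1.0, 0.0]]  # class ignores it

leaky = ddo_penalty(leaky_weights, concepts, domain_descriptors)
clean = ddo_penalty(clean_weights, concepts, domain_descriptors)
print(leaky, clean)
```

In this toy case the penalty is larger for the classifier that relies on the domain-specific concept and zero for the one that ignores it. Under the stated assumptions, such a term would be added to the task loss with a weighting coefficient, which is consistent with the plug-in, architecture-agnostic description above.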
