🤖 AI Summary
Concept bottleneck models (CBMs) suffer from poor editability: existing approaches require full retraining to modify concepts or labels, which hinders adaptability in dynamic real-world scenarios such as privacy-preserving updates, label correction, and concept evolution.
Method: This paper introduces the first editable CBM framework, enabling efficient, training-free insertion and deletion of concepts and labels at three levels of granularity: concept-label-level, concept-level, and data-level. Leveraging influence function theory, the authors derive a mathematically rigorous, closed-form model update mechanism that bypasses gradient-based retraining.
Contribution/Results: Evaluated on multiple benchmarks, our method achieves millisecond-scale editing latency and maintains prediction consistency with error <1.2%, significantly enhancing practical utility while overcoming the static nature of conventional CBMs.
📝 Abstract
Concept Bottleneck Models (CBMs) have garnered much attention for their ability to elucidate the prediction process through a human-understandable concept layer. However, most previous studies focused on cases where the data, including concepts, are clean. In many scenarios, we often need to remove some training data or concepts from trained CBMs, or insert new ones, for reasons such as privacy concerns, data mislabelling, spurious concepts, and concept annotation errors. Thus, deriving efficiently editable CBMs without retraining from scratch remains a challenge, particularly in large-scale applications. To address these challenges, we propose Editable Concept Bottleneck Models (ECBMs). Specifically, ECBMs support three different levels of data removal: concept-label-level, concept-level, and data-level. ECBMs enjoy mathematically rigorous closed-form approximations derived from influence functions that obviate the need for retraining. Experimental results demonstrate the efficiency and adaptability of our ECBMs, affirming their practical value.
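The abstract does not give the update formulas, but the flavor of "closed-form removal without retraining" can be sketched on a toy model. The following is an illustrative example only, not the authors' method or setting: for ridge regression, deleting one training sample admits an exact rank-one update of the normal-equations solution via the Sherman-Morrison identity, which is the same spirit as influence-function-based editing. All variable names are hypothetical.

```python
import numpy as np

# Toy setup: ridge regression on synthetic data (illustrative, not the ECBM model).
rng = np.random.default_rng(0)
n, d, lam = 200, 5, 1e-2
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

# Full solution: w = (X^T X + lam*I)^{-1} X^T y
A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))
w_full = A_inv @ (X.T @ y)

# "Edit": remove sample i in closed form, without retraining.
# Sherman-Morrison: (A - x x^T)^{-1} = A^{-1} + (A^{-1} x)(x^T A^{-1}) / (1 - x^T A^{-1} x)
i = 7
x, yi = X[i], y[i]
A_inv_del = A_inv + np.outer(A_inv @ x, x @ A_inv) / (1.0 - x @ A_inv @ x)
w_edit = A_inv_del @ (X.T @ y - x * yi)

# Reference: retrain from scratch on the remaining n-1 samples.
mask = np.ones(n, dtype=bool)
mask[i] = False
Xr, yr = X[mask], y[mask]
w_retrain = np.linalg.solve(Xr.T @ Xr + lam * np.eye(d), Xr.T @ yr)

print(np.allclose(w_edit, w_retrain))  # True: the rank-one edit matches retraining exactly
```

For this linear model the update is exact; for the nonlinear concept and label layers of a CBM, the paper instead relies on influence-function approximations, which is why its reported consistency error is small but nonzero.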