🤖 AI Summary
This work reveals that fidelity evaluation metrics in eXplainable AI (XAI), such as Insertion and Deletion, are highly sensitive to the choice of baseline: varying the baseline yields inconsistent rankings of attribution methods and can even reverse which method is judged optimal. The core issue is a trade-off inherent to conventional baselines: they struggle to simultaneously achieve *complete removal of target feature information* and *distributional plausibility* (i.e., avoidance of out-of-distribution, OOD, inputs). To address this, we formally define these two ideal baseline properties and propose the first model-dependent, learnable *feature-visualizing baseline*, generated via activation inversion so that the baseline image adapts to the model being explained. Extensive experiments across multiple models, datasets, and mainstream attribution methods demonstrate that no standard fixed baseline satisfies both properties concurrently, whereas our learned baseline substantially alleviates the trade-off between information removal and OOD-ness, yielding markedly more consistent and reliable fidelity evaluations.
📝 Abstract
Attribution methods are among the most prevalent techniques in Explainable Artificial Intelligence (XAI) and are usually evaluated and compared using Fidelity metrics, with Insertion and Deletion being the most popular. These metrics rely on a baseline function to alter the pixels of the input image that the attribution map deems most important. In this work, we highlight a critical problem with these metrics: the choice of a given baseline will inevitably favour certain attribution methods over others. More concerningly, even a simple linear model with commonly used baselines contradicts itself by designating different optimal methods. A question then arises: which baseline should we use? We propose to study this problem through two desirable properties of a baseline: (i) that it removes information and (ii) that it does not produce overly out-of-distribution (OOD) images. We first show that none of the tested baselines satisfy both criteria, and there appears to be a trade-off among current baselines: either they remove information or they produce a sequence of OOD images. Finally, we introduce a novel baseline by leveraging recent work in feature visualisation to artificially produce a model-dependent baseline that removes information without being overly OOD, thus improving on the trade-off when compared to other existing baselines. Our code is available at https://github.com/deel-ai-papers/Back-to-the-Baseline
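To make the metric concrete, here is a minimal sketch of a Deletion-style fidelity curve: pixels are removed in decreasing order of attribution and replaced by a baseline, while the model's score is recorded. This is an illustrative toy implementation, not the paper's code; the function name `deletion_curve`, the grayscale single-image setting, and the scalar-scoring `model` callable are simplifying assumptions.

```python
import numpy as np

def deletion_curve(model, image, attribution, baseline, steps=10):
    """Hypothetical sketch of the Deletion fidelity metric.

    Progressively replaces the most-attributed pixels of `image`
    with the corresponding pixels of `baseline`, recording the
    model's score after each step. A faithful attribution map
    should make the score drop quickly; the area under this
    curve is the Deletion score (lower is better).
    """
    h, w = attribution.shape
    n = h * w
    # Pixel indices sorted from most to least important.
    order = np.argsort(attribution.flatten())[::-1]
    perturbed = image.copy()
    scores = [model(perturbed)]
    for k in range(1, steps + 1):
        # Replace the top fraction k/steps of pixels with the baseline.
        idx = order[: int(n * k / steps)]
        ys, xs = np.unravel_index(idx, (h, w))
        perturbed[ys, xs] = baseline[ys, xs]
        scores.append(model(perturbed))
    return np.array(scores)
```

The Insertion metric is the mirror image: start from the baseline and progressively insert the most important pixels of the original image. The paper's point is that the choice of `baseline` (zeros, blur, noise, or the proposed learned baseline) changes these curves enough to reorder which attribution method looks best.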