The Effect of Similarity Measures on Accurate Stability Estimates for Local Surrogate Models in Text-based Explainable AI

📅 2024-06-22
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work examines how the choice of ranking-based similarity measure—Kendall's Tau, Spearman's Footrule, and Rank-biased Overlap—affects stability estimates for local surrogate explainers (e.g., LIME) in text-based eXplainable AI (XAI). Analyzing these measures under common adversarial perturbation processes, the authors find that overly sensitive measures (e.g., vanilla Kendall's Tau) respond strongly to trivial rank shifts rather than semantically meaningful changes, and therefore overstate explanation fragility; overly coarse measures understate it. The work shows that both the similarity measure and its threshold of success materially change the conclusions drawn from standard attack pipelines, and argues that these choices should be grounded in the downstream interpretability task rather than defaulting to generic, off-the-shelf measures.

📝 Abstract
Recent work has investigated the vulnerability of local surrogate methods to adversarial perturbations on a machine learning (ML) model's inputs, where the explanation is manipulated while the meaning and structure of the original input remain similar under the complex model. Although weaknesses across many methods have been shown to exist, the reasons why remain largely unexplored. Central to the concept of adversarial attacks on explainable AI (XAI) is the similarity measure used to calculate how one explanation differs from another. A poor choice of similarity measure can lead to erroneous conclusions on the efficacy of an XAI method: too sensitive a measure exaggerates vulnerability, while too coarse a measure understates it. We investigate a variety of similarity measures designed for text-based ranked lists, including Kendall's Tau, Spearman's Footrule, and Rank-biased Overlap, to determine how substantial changes in the type of measure or threshold of success affect the conclusions generated from common adversarial attack processes. Certain measures are found to be overly sensitive, resulting in erroneous estimates of stability.
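The abstract's point about measure sensitivity can be illustrated on a toy ranking. The sketch below is not the paper's code: it implements the three named measures in their simplest forms (pairwise Kendall's Tau, plain Spearman's Footrule, and truncated Rank-biased Overlap without extrapolation) and applies them to a single top-two swap in a LIME-style word ranking — the kind of trivial shift that an over-sensitive measure punishes heavily. The word lists are invented for illustration.

```python
# All three measures implemented from scratch (standard library only).

def kendall_tau(a, b):
    """Kendall's Tau between two rankings of the same items (+1 = identical order)."""
    pos_b = {item: i for i, item in enumerate(b)}
    n = len(a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # The pair (a[i], a[j]) is concordant if b preserves its relative order.
            if pos_b[a[i]] < pos_b[a[j]]:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def spearman_footrule(a, b):
    """Sum of absolute rank displacements (0 = identical order)."""
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))

def rank_biased_overlap(a, b, p=0.9):
    """Truncated RBO: top-weighted prefix overlap, no extrapolation beyond the lists."""
    depth = max(len(a), len(b))
    seen_a, seen_b, score = set(), set(), 0.0
    for d in range(1, depth + 1):
        if d <= len(a):
            seen_a.add(a[d - 1])
        if d <= len(b):
            seen_b.add(b[d - 1])
        # Prefix agreement at depth d, geometrically down-weighted.
        score += p ** (d - 1) * len(seen_a & seen_b) / d
    return (1 - p) * score

# A LIME-style word-importance ranking and a perturbed copy
# in which only the top two words have swapped places.
original = ["not", "good", "movie", "plot", "the"]
perturbed = ["good", "not", "movie", "plot", "the"]

print(kendall_tau(original, perturbed))          # 0.8: one discordant pair out of ten
print(spearman_footrule(original, perturbed))    # 2: each swapped word moved one place
print(rank_biased_overlap(original, perturbed))  # ~0.31 (ceiling at this depth is ~0.41)
```

The point is not the absolute numbers but that the three measures move by different amounts for the same trivial perturbation (and truncated RBO never reaches 1 at finite depth), so a fixed success threshold can classify the same attack as a success under one measure and a failure under another — the erroneous stability estimates the paper warns about.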
Problem

Research questions and friction points this paper is trying to address.

Similarity Measurement
Stability Assessment
Adversarial Perturbations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interpretable AI
Stability Assessment
Similarity Measures