🤖 AI Summary
This study addresses the lack of a unified framework for evaluating the interpretability of large language models (LLMs). We propose a four-dimensional quantitative evaluation framework covering transparency, robustness, consistency, and contrastivity, and introduce four novel metrics, including Human-reasoning Agreement. We conduct the first systematic, cross-model (e.g., LLaMA, ChatGLM) and cross-task (sentiment analysis, text classification) empirical comparison of five XAI methods (LIME, SHAP, Integrated Gradients, Layer-wise Relevance Propagation (LRP), and Attention Mechanism Visualization (AMV)) on the IMDB and Tweet Sentiment datasets. Results show that LIME achieves the best overall performance, AMV excels in robustness and consistency, and LRP demonstrates the strongest contrastivity on complex models. This work establishes an empirically grounded, model- and task-agnostic benchmark for XAI method selection, providing both practical guidance and theoretical foundations for interpretability assessment in LLMs.
📝 Abstract
The increasing complexity of LLMs presents significant challenges to their transparency and interpretability, necessitating the use of eXplainable AI (XAI) techniques to enhance trustworthiness and usability. This study introduces a comprehensive evaluation framework with four novel metrics for assessing the effectiveness of five XAI techniques across five LLMs and two downstream tasks. We apply this framework to evaluate five XAI techniques: LIME, SHAP, Integrated Gradients, Layer-wise Relevance Propagation (LRP), and Attention Mechanism Visualization (AMV), using the IMDB Movie Reviews and Tweet Sentiment Extraction datasets. The evaluation focuses on four key metrics: Human-reasoning Agreement (HA), Robustness, Consistency, and Contrastivity. Our results show that LIME consistently achieves high scores across multiple LLMs and evaluation metrics, while AMV demonstrates superior Robustness and near-perfect Consistency. LRP excels in Contrastivity, particularly with more complex models. Our findings provide valuable insights into the strengths and limitations of different XAI methods, offering guidance for developing and selecting appropriate XAI techniques for LLMs.
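To make the Human-reasoning Agreement (HA) metric concrete, here is a minimal toy sketch of an HA-style score: the overlap between the top-k tokens an XAI method attributes the most importance to and a human-annotated rationale. The paper's exact formula is not reproduced here; the intersection-over-rationale ratio, the `ha_score` function, and the example inputs are illustrative assumptions only.

```python
# Toy HA-style score: how many human-rationale tokens appear among the
# top-k tokens ranked by an explainer's attribution scores.
# This is an illustrative assumption, not the paper's exact definition.

def ha_score(attributions, human_rationale, k=3):
    """attributions: dict mapping token -> importance score.
    human_rationale: set of tokens a human marked as decisive."""
    if not human_rationale:
        return 0.0
    # Rank tokens by attributed importance and keep the top k.
    top_k = sorted(attributions, key=attributions.get, reverse=True)[:k]
    # Fraction of the human rationale recovered by the explainer.
    return len(set(top_k) & human_rationale) / len(human_rationale)

# Hypothetical example: an explainer highlights "terrible" and "boring"
# for a negative movie review, matching the human rationale exactly.
attr = {"movie": 0.05, "terrible": 0.9, "boring": 0.7, "plot": 0.1}
rationale = {"terrible", "boring"}
print(ha_score(attr, rationale, k=3))  # 1.0: both rationale tokens are in the top 3
```

A score of 1.0 means the explainer's top-ranked tokens fully cover the human rationale; 0.0 means no overlap.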