🤖 AI Summary
To address the limited interpretability and reliability of ReLU neural networks stemming from their "black-box" nature, this paper proposes a path-level explanation method that identifies and analyzes activation paths—subsets of hidden-layer units critical to decision-making—without relying on complete activation patterns. The method enables multi-granularity attribution decomposition, spanning global to local levels, and supports flexible adjustment of the explanatory scope within the input space. Its core innovation lies in shifting the explanatory focus from redundant full activation paths to semantically coherent, task-relevant path subsets, thereby significantly improving explanation consistency, readability, and trustworthiness. Experiments demonstrate that the approach outperforms state-of-the-art interpretation techniques in both quantitative metrics (e.g., faithfulness and stability) and qualitative assessments (e.g., visual interpretability).
📝 Abstract
Neural networks have demonstrated a wide range of successes, but their "black box" nature raises concerns about transparency and reliability. Previous research on ReLU networks has sought to unwrap these networks into linear models based on the activation states of all hidden units. In this paper, we introduce a novel approach that considers subsets of the hidden units involved in the decision-making path. This pathwise explanation provides a clearer and more consistent understanding of the relationship between the input and the decision-making process. Our method also offers flexibility in adjusting the scope of explanations within the input, i.e., from an overall attribution of the input to particular components within it. Furthermore, it allows the explanation for a given input to be decomposed into more detailed explanations. Experiments demonstrate that our method outperforms others both quantitatively and qualitatively.
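The "unwrapping" of a ReLU network into a linear model mentioned in the abstract can be illustrated with a minimal sketch: once the activation states of the hidden units are fixed for a given input, the network is exactly an affine function of that input. The two-layer architecture and random weights below are hypothetical, chosen only for illustration, and do not reflect the paper's actual models.

```python
import numpy as np

# Hypothetical 2-layer ReLU network: f(x) = W2 @ relu(W1 @ x + b1) + b2
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

def forward(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x = rng.normal(size=3)

# Activation pattern: which hidden units are "on" for this input.
pattern = (W1 @ x + b1 > 0).astype(float)

# With the pattern fixed, relu(z) = pattern * z, so the network collapses
# to an affine map:  f(x) = W_eff @ x + b_eff
W_eff = W2 @ (pattern[:, None] * W1)
b_eff = W2 @ (pattern * b1) + b2

assert np.allclose(forward(x), W_eff @ x + b_eff)
```

The full-activation-pattern approaches cited in the abstract explain decisions through `W_eff` and `b_eff` computed over all hidden units; the paper's contribution is to restrict attention to subsets of these units that lie on the decision-making path.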