Normalized Relevance Measure as a Unifying Framework to Explain Neural Network Latent Structures

📅 2026-05-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing methods struggle to quantify the importance of neuron sets across layers of deep neural networks in a uniform way. This work proposes the Normalized Relevance Measure (NRM) framework, which explicitly defines neuron relevance as a normalized signed measure. Using marginalization and conditioning operations that follow additive and multiplicative laws analogous to those of probability measures, NRM makes joint relevance across multiple layers comparable in arbitrary architectures. The framework subsumes existing propagation-based explanation methods and supports cross-layer comparison and joint analysis. In computer vision experiments on VGG16 networks, joint relevance analysis across multiple layers reveals key information flows, which illustrates the framework's utility for studying the internal inference mechanisms of neural networks.

📝 Abstract

To understand how a neural network (NN) functions and makes predictions, analyzing only the input domain is insufficient: one must also examine its internal inference mechanisms to capture the complete picture. Explaining these mechanisms requires analyzing the importance of latent representations for a given task. In this paper, we propose the normalized relevance measure (NRM) framework, a novel general explanation procedure that attributes relevance to arbitrary sets of neurons across layers of arbitrary architectures. In the NRM framework, the relevance of selected neurons is explicitly defined as a normalized signed measure, constructed using two simple operations, marginalization and conditioning, which follow additive and multiplicative laws in analogy to probability measures. The normalization property further guarantees comparability across layers. The NRM framework subsumes existing propagation-based explanation algorithms by explicitly identifying the underlying quantity being computed. We demonstrate the utility of the framework in computer vision applications, where joint relevance analysis across multiple layers reveals key information flows in VGG16 networks. Overall, the NRM framework provides a general, mathematically grounded approach to understanding how modern NNs propagate information, offering a versatile and broadly applicable foundation for explainable artificial intelligence.
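
The additive and multiplicative laws mentioned in the abstract can be illustrated with a small numerical sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-layer linear network without biases, the weights `W1` and `w2`, the input `x`, and the choice of contribution-based relevance. It only shows how a signed relevance that sums to one per layer could be combined across layers by conditioning (multiplication) and marginalization (summation).

```python
import numpy as np

# Hypothetical toy network (not from the paper): h = W1 @ x, y = w2 @ h, no biases.
x = np.array([1.0, 2.0, -1.0])
W1 = np.array([[0.5, -1.0, 2.0],
               [1.5, 0.5, 1.0]])
w2 = np.array([2.0, 1.0])

h = W1 @ x   # hidden activations, shape (2,)
y = w2 @ h   # scalar output; assumed non-zero so that normalization is defined

# Relevance of each hidden neuron: its signed share of the output.
# Entries may be negative, but they sum to 1 (normalized signed measure).
R_hidden = (w2 * h) / y

# Conditional relevance of input i given hidden neuron j: the signed share
# of h[j] contributed by x[i]. Each row sums to 1 (requires h[j] != 0).
R_in_given_hidden = (W1 * x) / h[:, None]

# Multiplicative law: joint relevance of the pair (hidden j, input i).
R_joint = R_in_given_hidden * R_hidden[:, None]

# Additive law (marginalization): summing the joint over one layer
# gives the relevance of the other layer.
R_in = R_joint.sum(axis=0)
R_hidden_from_joint = R_joint.sum(axis=1)

# Relevance of a set of neurons is the sum over its members,
# here the set containing input neurons 1 and 2.
R_in_set = R_in[[1, 2]].sum()

print("hidden relevance:", R_hidden)
print("input relevance: ", R_in)
print("joint relevance:\n", R_joint)
```

Because both layers carry total relevance one in this sketch, the per-layer values are directly comparable, which is the role the abstract assigns to the normalization property. In a nonlinear network the conditional term would have to come from a propagation rule, which this toy example does not attempt to model.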

Problem

Research questions and friction points this paper is trying to address.

neural network interpretability

latent representations

relevance measure

explainable AI

internal inference mechanisms

Innovation

Methods, ideas, or system contributions that make the work stand out.

normalized relevance measure

explainable AI

neural network interpretability