🤖 AI Summary
The weak interpretability of deep learning models, compounded by the coarse structural resolution and semantic ambiguity of typical attribution masks, hinders human understanding of model decisions. To address this, we bring implicit neural representations (INRs) to visual attribution explanation, the first such application. We propose a conditional INR framework that explicitly enforces area constraints and incorporates an iterative optimization mechanism to generate multiple non-overlapping, structurally coherent attribution masks. By integrating coordinate-based encoding into the implicit network and building on the extremal perturbations technique, our method significantly improves the semantic consistency and spatial compactness of the masks. Experiments demonstrate that the approach effectively disentangles the joint discriminative mechanisms underlying object identity and contextual texture, revealing the multi-source decision rationales the model relies on. This enhances both human comprehension of black-box model behavior and trust in its predictions.
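The summary names three ingredients: a coordinate-based implicit network, an explicit area constraint, and extremal-perturbation-style optimization of a preservation objective. Below is a minimal PyTorch sketch of how such pieces could fit together. It assumes a frozen, eval-mode classifier `model` that returns logits; the names (`CoordinateINR`, `area_penalty`, `optimize_mask`), the Fourier-feature encoding, the black baseline, and all hyperparameters are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn


class CoordinateINR(nn.Module):
    """Implicit network: maps (x, y) pixel coordinates to a soft mask value in [0, 1].

    A Fourier-feature encoding of the coordinates is a common choice for
    coordinate-based networks; the paper's exact encoding may differ.
    """

    def __init__(self, num_freqs=6, hidden=128):
        super().__init__()
        self.num_freqs = num_freqs
        in_dim = 4 * num_freqs  # sin and cos for each of the 2 coordinates
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, coords):
        # coords: (N, 2) with values in [-1, 1]
        freqs = torch.pi * 2.0 ** torch.arange(self.num_freqs, dtype=torch.float32)
        enc = coords[..., None] * freqs                          # (N, 2, F)
        enc = torch.cat([enc.sin(), enc.cos()], dim=-1).flatten(1)  # (N, 4F)
        return self.mlp(enc)


def area_penalty(mask, target_area):
    """Soft area constraint: push the mean mask value toward a target fraction."""
    return (mask.mean() - target_area) ** 2


def optimize_mask(model, image, label, target_area=0.1, steps=400, lam=100.0):
    """Fit an INR so the masked image preserves the classifier's score for
    `label` while the mask covers roughly `target_area` of the image.
    `image` is a (C, H, W) tensor; `model` is assumed frozen."""
    _, H, W = image.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)

    inr = CoordinateINR()
    opt = torch.optim.Adam(inr.parameters(), lr=1e-3)
    baseline = torch.zeros_like(image)  # black out pixels outside the mask

    for _ in range(steps):
        mask = inr(coords).reshape(1, 1, H, W)
        perturbed = mask * image + (1 - mask) * baseline
        score = model(perturbed)[0, label]
        # Maximize the preserved class score subject to the soft area constraint.
        loss = -score + lam * area_penalty(mask, target_area)
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():
        return inr(coords).reshape(H, W)
```

Parameterizing the mask through an INR, rather than optimizing pixel values directly, tends to yield spatially smooth, coherent regions, which is consistent with the compactness claims above.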
📝 Abstract
Explaining deep learning models in a way that humans can easily understand is essential for responsible artificial intelligence applications. Attribution methods constitute an important area of explainable deep learning. The attribution problem involves finding the parts of the network's input that are most responsible for the model's output. In this work, we demonstrate that implicit neural representations (INRs) constitute a good framework for generating visual explanations. Firstly, we utilize coordinate-based implicit networks to reformulate and extend the extremal perturbations technique and generate attribution masks. Experimental results confirm the usefulness of our method: for instance, by properly conditioning the implicit network, we obtain attribution masks that are well behaved with respect to the imposed area constraints. Secondly, we present an iterative INR-based method that can be used to generate multiple non-overlapping attribution masks for the same image. We show that a deep learning model may associate an image label both with the appearance of the object of interest and with the areas and textures that usually accompany the object. Our study demonstrates that implicit networks are well suited to the generation of attribution masks and can provide interesting insights into the performance of deep learning models.
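To illustrate the iterative, non-overlapping mask generation described above, here is a hedged sketch reusing `CoordinateINR` and `area_penalty` from the earlier block. The overlap penalty (weight `beta`) against the union of previously found masks is our illustrative assumption for how non-overlap could be enforced; the paper's actual mechanism may differ.

```python
def iterative_masks(model, image, label, k=3, target_area=0.1,
                    steps=400, lam=100.0, beta=50.0):
    """Generate k roughly non-overlapping masks by penalizing, in each round,
    overlap with the union of masks found so far (illustrative scheme)."""
    _, H, W = image.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
    baseline = torch.zeros_like(image)
    union = torch.zeros(1, 1, H, W)  # union of previously found masks
    masks = []

    for _ in range(k):
        inr = CoordinateINR()
        opt = torch.optim.Adam(inr.parameters(), lr=1e-3)
        for _ in range(steps):
            mask = inr(coords).reshape(1, 1, H, W)
            perturbed = mask * image + (1 - mask) * baseline
            score = model(perturbed)[0, label]
            overlap = (mask * union).mean()  # discourage reusing explained regions
            loss = -score + lam * area_penalty(mask, target_area) + beta * overlap
            opt.zero_grad()
            loss.backward()
            opt.step()
        with torch.no_grad():
            mask = inr(coords).reshape(1, 1, H, W)
        masks.append(mask.squeeze())
        union = torch.clamp(union + mask, max=1.0)
    return masks
```

Each successive mask is pushed toward regions the previous masks did not cover, which is how a single image can yield separate explanations for the object itself and for its accompanying context or texture.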