🤖 AI Summary
Existing pixel-level feature attribution methods neglect the structural semantics of input data, such as textures in images and transients in audio, and lack cross-modal consistency. To address this, we propose the Wavelet Attribution Method (WAM), the first interpretability framework grounded in wavelet theory, enabling unified attribution across images, audio, and 3D shapes. WAM answers both *where* features are important (spatial/temporal localization) and *why* they matter (multi-scale, frequency-domain semantic interpretation). It achieves this via the continuous wavelet transform, gradient-driven attribution, and a cross-modal adaptation architecture that explicitly models the semantic meaning of structural components. Evaluated on classification tasks across all three modalities, WAM matches or surpasses state-of-the-art methods, with an average 12.7% improvement in faithfulness metrics, demonstrating the effectiveness and generalizability of its dual-dimensional (localization plus semantics) explanatory capability.
📝 Abstract
Despite the growing use of deep neural networks in safety-critical decision-making, their inherent black-box nature hinders transparency and interpretability. Explainable AI (XAI) methods have thus emerged to shed light on a model's internal workings, most notably attribution methods, also called saliency maps. Conventional attribution methods typically identify the locations -- the where -- of significant regions within an input. However, because they overlook the inherent structure of the input data, these methods often fail to interpret what these regions represent in terms of structural components (e.g., textures in images or transients in sounds). Furthermore, existing methods are usually tailored to a single data modality, limiting their generalizability. In this paper, we propose leveraging the wavelet domain as a robust mathematical foundation for attribution. Our approach, the Wavelet Attribution Method (WAM), extends existing gradient-based feature attribution methods into the wavelet domain, providing a unified framework for explaining classifiers across images, audio, and 3D shapes. Empirical evaluations demonstrate that WAM matches or surpasses state-of-the-art methods across faithfulness metrics and models in image, audio, and 3D explainability. Finally, we show how our method explains not only the where -- the important parts of the input -- but also the what -- the relevant patterns in terms of structural components.
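The key move in the abstract, taking gradients with respect to wavelet coefficients rather than raw input features, can be illustrated with a toy sketch. This is not the paper's implementation (WAM operates on deep classifiers with continuous wavelet transforms across modalities); here we assume a linear "classifier" and a single-level orthonormal Haar transform so the chain rule is explicit. The function names `haar_dwt`, `haar_idwt`, and `wavelet_attribution` are illustrative, not from the paper.

```python
import numpy as np

def haar_dwt(x):
    # Single-level orthonormal Haar transform: approximation + detail halves.
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return np.concatenate([a, d])

def haar_idwt(c):
    # Inverse of the transform above (orthonormal, so inverse = transpose).
    n = len(c) // 2
    a, d = c[:n], c[n:]
    x = np.empty(2 * n)
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def wavelet_attribution(x, w):
    # Toy "classifier" score f(x) = w . x, so grad_x f = w.
    # With x = W^T c and W orthonormal, grad_c f = W w, i.e. the
    # forward Haar transform of the input-space gradient.
    grad_c = haar_dwt(w)
    c = haar_dwt(x)
    # Gradient-times-input attribution, but in the wavelet domain:
    # each coefficient (a scale/location structural component) gets a score.
    return grad_c * c

rng = np.random.default_rng(0)
w = rng.normal(size=8)   # classifier weights (stand-in for a trained model)
x = rng.normal(size=8)   # input signal
attr = wavelet_attribution(x, w)
```

Because the Haar transform is orthonormal, the attributions sum to the model's score for a linear classifier (a completeness-style sanity check); the same per-coefficient scores, grouped by scale, are what lets a wavelet-domain method say *what kind* of structure mattered, not just *where*.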