🤖 AI Summary
To meet the dual requirements of accuracy and interpretability in safety-critical domains, this paper proposes WASUP, an intrinsically interpretable neural network. WASUP uses class-discriminative support vectors as human-understandable prototypes, leverages the B-cos transformation to enforce weight-input alignment, and classifies via similarity-based aggregation, thereby providing faithful local and global explanations by design. It is presented as the first work to unify case-based reasoning, support-vector representation, and weight-input alignment within a single neural architecture, and the authors prove that its explanations satisfy established axioms of faithfulness. Evaluated on fine-grained classification (Stanford Dogs), multi-label classification (Pascal VOC), and pathology detection (RSNA), WASUP achieves accuracy competitive with state-of-the-art black-box models while delivering verifiable, trustworthy explanations, enhancing model reliability in high-stakes applications.
📝 Abstract
The deployment of deep learning models in critical domains necessitates a balance between high accuracy and interpretability. We introduce WASUP, an inherently interpretable neural network that provides local and global explanations of its decision-making process. We prove that these explanations are faithful by fulfilling established axioms for explanations. Leveraging the concept of case-based reasoning, WASUP extracts class-representative support vectors from training images, ensuring that they capture relevant features while suppressing irrelevant ones. Classification decisions are made by computing and aggregating similarity scores between these support vectors and the input's latent feature vector. We employ B-cos transformations, which align model weights with inputs to enable faithful mappings of latent features back to the input space, facilitating local explanations in addition to the global explanations provided by case-based reasoning. We evaluate WASUP on three tasks: fine-grained classification on Stanford Dogs, multi-label classification on Pascal VOC, and pathology detection on the RSNA dataset. Results indicate that WASUP not only achieves competitive accuracy compared to state-of-the-art black-box models but also offers insightful explanations verified through theoretical analysis. Our findings underscore WASUP's potential for applications where understanding model decisions is as critical as the decisions themselves.
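To make the two core mechanisms concrete, the following is a minimal, hypothetical sketch of (a) a B-cos-style unit, whose linear response is damped by a power of the cosine between input and weights so that large outputs require weight-input alignment, and (b) classification by summing similarity scores between a latent vector and class-labelled support vectors. Function names, the sum-per-class aggregation, and all numbers are illustrative assumptions, not the paper's implementation.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _norm(a):
    return math.sqrt(_dot(a, a))

def bcos_transform(x, w, b=2.0):
    # Illustrative B-cos unit: the linear response w.x is scaled by
    # |cos(x, w)|^(b-1), so the output is large only when the input
    # aligns with the weights. A sketch, not the paper's exact layer.
    cos = _dot(w, x) / (_norm(w) * _norm(x) + 1e-12)
    return _dot(w, x) * abs(cos) ** (b - 1.0)

def classify(latent, support_vectors, labels, n_classes):
    # Hypothetical aggregation: sum cosine similarities between the
    # input's latent vector and each class-labelled support vector;
    # the class with the highest aggregate score wins.
    scores = [0.0] * n_classes
    for sv, c in zip(support_vectors, labels):
        scores[c] += _dot(latent, sv) / (_norm(latent) * _norm(sv) + 1e-12)
    return scores

# Toy usage: a latent vector aligned with class-0 support vectors
# scores highest for class 0.
latent = [1.0, 0.0]
svs = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0]]
labels = [0, 0, 1]
print(classify(latent, svs, labels, 2))
```

Because each class score is a sum of interpretable similarity terms, the contribution of every support vector (a training-set prototype) can be read off directly, which is the sense in which the case-based reasoning is globally explainable.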