Massive Activations in Graph Neural Networks: Decoding Attention for Domain-Dependent Interpretability

📅 2024-09-05
🤖 AI Summary
This work identifies the pervasive “Massive Activations” (MAs) phenomenon in edge-attention graph neural networks (GNNs): MAs are not mere activation anomalies but encode critical domain semantics, e.g., single/double bond patterns in molecular graphs. We establish, for the first time, a causal link between attention mechanisms and MA generation, and propose a rigorous, statistically grounded definition and detection framework for MAs that incorporates domain-specific constraints. To enhance interpretability, we design an ablation-driven attribution redistribution method that transforms MAs into explainable, posterior attribution scores. Empirical evaluation on the ZINC, TOX21, and PROTEINS benchmarks demonstrates that this treatment of MAs substantially improves alignment between GNN internal activations and chemical priors. Our approach introduces a novel paradigm for interpretable AI in structured domains such as molecular modeling.

📝 Abstract
Graph Neural Networks (GNNs) have become increasingly popular for effectively modeling graph-structured data, and attention mechanisms have been pivotal in enabling these models to capture complex patterns. In our study, we reveal a critical yet underexplored consequence of integrating attention into edge-featured GNNs: the emergence of Massive Activations (MAs) within attention layers. By developing a novel method for detecting MAs on edge features, we show that these extreme activations are not mere activation anomalies but encode domain-relevant signals. Our post-hoc interpretability analysis demonstrates that, in molecular graphs, MAs aggregate predominantly on common bond types (e.g., single and double bonds) while sparing more informative ones (e.g., triple bonds). Furthermore, our ablation studies confirm that MAs can serve as natural attribution indicators, reallocating to less informative edges. Our study assesses various edge-featured attention-based GNN models on benchmark datasets, including ZINC, TOX21, and PROTEINS. Key contributions include (1) establishing a direct link between attention mechanisms and MA generation in edge-featured GNNs, and (2) developing a robust definition and detection method for MAs that enables reliable post-hoc interpretability. Overall, our study reveals the complex interplay between attention mechanisms, edge-featured GNN models, and MA emergence, providing crucial insights for relating GNN internals to domain knowledge.
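The abstract describes detecting MAs as extreme outliers among edge-feature activations. The paper's exact statistical criterion is not reproduced on this page; as an illustration only, the sketch below uses a common heuristic from the massive-activations literature, flagging entries whose magnitude exceeds both an absolute threshold and a large multiple of the layer's median magnitude (the function name and thresholds are hypothetical choices, not the authors' definition):

```python
import numpy as np

def detect_massive_activations(edge_acts, abs_thresh=100.0, rel_thresh=1000.0):
    """Flag entries whose magnitude exceeds both an absolute threshold and a
    large multiple of the median magnitude across the layer.

    edge_acts: (num_edges, hidden_dim) array of edge-attention activations.
    Returns a boolean mask of the same shape marking candidate MAs.
    """
    mags = np.abs(edge_acts)
    median_mag = np.median(mags)
    return (mags > abs_thresh) & (mags > rel_thresh * median_mag)

# Toy example: one planted entry dwarfs the rest of the layer.
acts = np.random.default_rng(0).normal(0.0, 0.05, size=(6, 8))
acts[2, 3] = 500.0  # planted "massive" activation
mask = detect_massive_activations(acts)
```

Relative and absolute thresholds are combined so that a layer with uniformly small activations does not spuriously flag its largest (but still ordinary) entry.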
Problem

Research questions and friction points this paper is trying to address.

Identifies Massive Activations in edge-featured GNNs.
Links attention mechanisms to domain-relevant signal encoding.
Develops methods for detecting and interpreting activation anomalies.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Detects Massive Activations in edge-featured GNNs
Links attention mechanisms to domain-relevant signals
Provides interpretability via MA-based attribution indicators
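The summary mentions an ablation-driven attribution redistribution that turns MAs into posterior attribution scores. The mechanics are not detailed on this page; a minimal sketch, assuming a simple leave-one-out ablation (all names here are illustrative, not the authors' API), scores each edge by how much the model output shifts when that edge's MA entries are zeroed:

```python
import numpy as np

def ablation_attribution(forward_fn, edge_acts, ma_mask):
    """Score each edge by the output change caused by zeroing its MA entries.

    forward_fn: callable mapping an edge-activation array to a scalar output.
    ma_mask: boolean mask of massive activations, same shape as edge_acts.
    Returns one attribution score per edge.
    """
    base = forward_fn(edge_acts)
    scores = np.zeros(edge_acts.shape[0])
    for e in range(edge_acts.shape[0]):
        if not ma_mask[e].any():
            continue  # no MAs on this edge, nothing to ablate
        ablated = edge_acts.copy()
        ablated[e, ma_mask[e]] = 0.0  # knock out this edge's MAs only
        scores[e] = abs(base - forward_fn(ablated))
    return scores

# Toy usage with a stand-in forward function (sum of activations).
forward_fn = lambda x: float(x.sum())
acts = np.zeros((4, 8))
acts[2, 3] = 500.0  # the only massive activation
scores = ablation_attribution(forward_fn, acts, np.abs(acts) > 100.0)
```

Edges with no MAs keep a score of zero, so attribution mass concentrates on edges whose extreme activations the model actually relies on.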
Lorenzo Bini
PhD Candidate at the University of Geneva
Graph Representation Learning · Self-Supervised Learning · 3D Genomics · Flow Matching

M. Sorbi
University of Geneva, Research Institute for Statistics and Information Science, Switzerland

Stéphane Marchand-Maillet
University of Geneva, Department of Computer Science, Switzerland