Unveiling Molecular Moieties through Hierarchical Grad-CAM Graph Explainability

📅 2024-01-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In drug discovery, graph neural network (GNN)-based virtual screening lacks substructure-level interpretability, which hinders mechanistic understanding and rational design. To address this, we propose the Hierarchical Grad-CAM graph Explainer (HGE), a hierarchical attribution framework that produces Grad-CAM explanations at three levels (atom, ring, and whole molecule) and leverages the GNN message-passing mechanism to highlight the most relevant chemical moieties. The underlying GNN classifiers, trained on 20 kinase targets, achieve state-of-the-art virtual screening performance. HGE attributions align closely with drug–target interaction motifs reported in the literature, recapitulating key binding fragments, including hinge-region hydrogen-bonding groups and hydrophobic pocket substituents. This work offers a multi-scale attribution approach, validated against experimental data, to support trustworthy GNN deployment in drug discovery.

📝 Abstract

Background: Virtual Screening (VS) has become an essential tool in drug discovery, enabling the rapid and cost-effective identification of potential bioactive molecules. Among recent advancements, Graph Neural Networks (GNNs) have gained prominence for their ability to model complex molecular structures using graph-based representations. However, integrating explainability methods that elucidate the specific contributions of molecular substructures to biological activity remains a significant challenge. This limitation hampers both the interpretability of predictive models and the rational design of novel therapeutics.

Results: We trained 20 GNN models on a dataset of small molecules to predict their activity on 20 distinct protein targets from the kinase family. These classifiers achieved state-of-the-art performance in virtual screening tasks, demonstrating high accuracy and robustness across targets. Building upon these models, we implemented the Hierarchical Grad-CAM graph Explainer (HGE) framework, enabling an in-depth analysis of the molecular moieties driving protein-ligand binding stabilization. HGE exploits Grad-CAM explanations at the atom, ring, and whole-molecule levels, leveraging the message-passing mechanism to highlight the most relevant chemical moieties. Validation against experimental data from the literature confirmed the ability of the explainer to recognize the molecular patterns of known drugs and correctly associate them with their known targets.

Conclusion: Our approach may provide valuable support for shortening both the screening and the hit discovery processes. Detailed knowledge of the molecular substructures that play a role in the binding process can help computational chemists gain insight into structure optimization and drug repurposing.
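The three explanation levels described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how atom-level scores are combined into ring and molecule scores, so the mean aggregation used here is an assumption, as are the function names `grad_cam_atom_scores` and `hierarchical_scores` and the toy numbers. The atom-level step follows the standard Grad-CAM recipe (channel weights from averaged gradients, then a ReLU over the weighted activations), applied to per-atom features instead of image pixels.

```python
from typing import Dict, List, Sequence


def grad_cam_atom_scores(
    activations: Sequence[Sequence[float]],
    gradients: Sequence[Sequence[float]],
) -> List[float]:
    """Standard Grad-CAM applied to per-atom features.

    activations[n][k]: channel k of atom n after the last message-passing layer.
    gradients[n][k]:   d(class score) / d(activations[n][k]).
    """
    n_atoms = len(activations)
    n_channels = len(activations[0])
    # Channel weight alpha_k: gradient averaged over all atoms.
    alphas = [
        sum(gradients[n][k] for n in range(n_atoms)) / n_atoms
        for k in range(n_channels)
    ]
    # Atom relevance: ReLU of the alpha-weighted sum of that atom's channels.
    return [max(0.0, sum(a * f for a, f in zip(alphas, row))) for row in activations]


def hierarchical_scores(
    atom_scores: Sequence[float],
    rings: Dict[str, Sequence[int]],
) -> dict:
    """Aggregate atom scores to ring and molecule level.

    ASSUMPTION: mean aggregation is chosen for illustration only; the source
    does not state which aggregation HGE uses.
    """
    ring_scores = {
        name: sum(atom_scores[i] for i in atom_ids) / len(atom_ids)
        for name, atom_ids in rings.items()
    }
    molecule_score = sum(atom_scores) / len(atom_scores)
    return {
        "atoms": list(atom_scores),
        "rings": ring_scores,
        "molecule": molecule_score,
    }


# Toy example (hypothetical values): 4 atoms, 2 feature channels.
activations = [[1.0, 0.0], [2.0, 1.0], [0.0, 3.0], [1.0, 1.0]]
gradients = [[0.5, -0.5]] * 4          # same gradient at every atom
rings = {"ring_A": [0, 1, 2]}          # atom indices forming one ring

atom_scores = grad_cam_atom_scores(activations, gradients)
result = hierarchical_scores(atom_scores, rings)
print(result)
```

In practice the activations and gradients would come from a trained GNN through automatic differentiation, and ring membership from a cheminformatics toolkit; both are replaced by hand-written lists here so the sketch runs without extra dependencies.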

Problem

Research questions and friction points this paper is trying to address.

Explain molecular substructure contributions to bioactivity

Improve interpretability of GNNs in drug discovery

Identify key molecular moieties for protein-ligand binding

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Grad-CAM graph Explainer (HGE) framework

GNN models for virtual screening tasks

Multi-level Grad-CAM explanations for molecular moieties

🔎 Similar Papers

MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation