GDEGAN: Gaussian Dynamic Equivariant Graph Attention Network for Ligand Binding Site Prediction

📅 2026-03-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Accurately predicting protein–ligand binding sites remains a central challenge in structure-based drug design. This work proposes GDEGAN, an equivariant graph neural network built around a Gaussian dynamic attention mechanism: it replaces conventional dot-product attention with a Gaussian kernel whose bandwidth is set adaptively from the statistics of local residue features (the local variance, recomputed at each layer), and it adds learnable per-head temperatures so that each protein region can weight its neighbors according to its own context. On the COACH420, HOLO4k, and PDBBind2020 benchmarks, the method reports relative improvements over existing approaches of 37–66% in DCC (distance between the predicted and true binding-site centers) and 7–19% in DCA (distance from the predicted center to the closest ligand atom) success rates.
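To make the reported metrics concrete, here is a minimal sketch of how DCC and DCA success rates and a relative improvement could be computed. It assumes the definitions commonly used in the binding-site literature (DCC: distance between the predicted site center and the ligand center; DCA: distance from the predicted center to the nearest ligand atom). The function names and the 4 Å cutoff are illustrative assumptions, not details taken from the paper.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def dcc(pred_center, ligand_atoms):
    """Distance from the predicted center to the ligand centroid (assumed DCC definition)."""
    return math.dist(pred_center, centroid(ligand_atoms))

def dca(pred_center, ligand_atoms):
    """Distance from the predicted center to the closest ligand atom (assumed DCA definition)."""
    return min(math.dist(pred_center, atom) for atom in ligand_atoms)

def success_rate(distances, threshold=4.0):
    """Fraction of predictions within `threshold` angstroms (cutoff is an assumption)."""
    return sum(d <= threshold for d in distances) / len(distances)

def relative_improvement(new_rate, baseline_rate):
    """Relative gain of a new success rate over a baseline success rate."""
    return (new_rate - baseline_rate) / baseline_rate

# Toy example: a two-atom "ligand" and one predicted pocket center.
ligand = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
prediction = (1.0, 3.0, 0.0)
print(dcc(prediction, ligand), dca(prediction, ligand))
```

Under these definitions DCA can never exceed the largest distance to any ligand atom, which is why DCA success rates are typically higher than DCC success rates for the same predictions.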

📝 Abstract

Accurate prediction of the sites on a given protein to which ligands can bind is a critical step in structure-based computational drug discovery. Recently, Equivariant Graph Neural Networks (GNNs) have emerged as a powerful paradigm for binding site identification methods due to the large-scale availability of 3D structures of proteins via protein databases and AlphaFold predictions. The state-of-the-art equivariant GNN methods implement dot-product attention, disregarding the variation in the chemical and geometric properties of the neighboring residues. To capture this variation, we propose GDEGAN (Gaussian Dynamic Equivariant Graph Attention Network), which replaces dot-product attention with adaptive kernels that recognize binding sites. The proposed attention mechanism captures variation in neighboring residues using statistics of their characteristic local feature distributions. Our mechanism dynamically computes neighborhood statistics at each layer, using local variance as an adaptive bandwidth parameter with learnable per-head temperatures, enabling each protein region to determine its own context-specific importance. GDEGAN outperforms existing methods with relative improvements of 37-66% in DCC and 7-19% in DCA success rates across the COACH420, HOLO4k, and PDBBind2020 datasets. These advances have direct application in accelerating protein-ligand docking by identifying potential binding sites for therapeutic target identification.
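The abstract does not give the attention formula, so the following is only a hedged sketch of one plausible form: a Gaussian kernel over feature distances whose bandwidth is the pooled variance of the neighborhood, scaled by a temperature. The function `gaussian_attention`, its arguments, and the exact way variance and temperature are combined are assumptions for illustration. The sketch covers a single head and omits the learned projections and equivariant coordinate updates a full model would need.

```python
import math

def gaussian_attention(h_i, neighbors, temperature=1.0, eps=1e-6):
    """Attention weights of one residue over its neighbors (hypothetical form).

    h_i         : feature vector of the central residue
    neighbors   : non-empty list of neighbor feature vectors (same length as h_i)
    temperature : stand-in for a learnable per-head temperature
    """
    dim, n = len(h_i), len(neighbors)
    # Neighborhood statistics: per-dimension mean, then variance pooled over dimensions.
    mean = [sum(h[d] for h in neighbors) / n for d in range(dim)]
    var = sum((h[d] - mean[d]) ** 2 for h in neighbors for d in range(dim)) / (n * dim)
    # Adaptive bandwidth: local variance scaled by the temperature (assumed combination).
    bandwidth = temperature * (var + eps)
    # Gaussian-kernel logits: closer features receive larger (less negative) scores.
    logits = [
        -sum((a - b) ** 2 for a, b in zip(h_i, h)) / (2.0 * bandwidth)
        for h in neighbors
    ]
    # Numerically stable softmax over the neighborhood.
    top = max(logits)
    exps = [math.exp(score - top) for score in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: three neighbors at increasing feature distance from the center.
center = [0.0, 0.0]
nbrs = [[0.1, 0.0], [1.0, 0.0], [2.0, 0.0]]
print(gaussian_attention(center, nbrs))
print(gaussian_attention(center, nbrs, temperature=10.0))
```

In this form, a homogeneous neighborhood (small variance) yields a narrow kernel and sharply peaked weights, while a heterogeneous one yields a wider kernel; raising the temperature flattens the distribution in either case.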

Problem

Research questions and friction points this paper is trying to address.

ligand binding site prediction

protein-ligand interaction

computational drug discovery

binding site identification

3D protein structure

Innovation

Methods, ideas, or system contributions that make the work stand out.

Equivariant Graph Neural Networks

Gaussian Attention

Dynamic Bandwidth