RADE: Random Add-Drop Edge as a Regularizer

📅 2026-05-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Graph neural networks are prone to overfitting and to over-squashing of long-range information, and existing augmentation or rewiring strategies each address only one of these issues. This work proposes RADE, a stochastic graph augmentation method that combines random edge deletion and random edge addition while keeping training and inference aligned. Adding edges improves connectivity to mitigate over-squashing, and the random perturbation regularizes training against overfitting. RADE also includes a mini-batch gradient-norm balancing algorithm that adapts the deletion and addition rates during training, making the method hyperparameter-free in practice. Experiments on node- and graph-classification benchmarks show that RADE is a strong regularizer and mitigates over-squashing. Ablation studies support the roles of the alignment mechanism, the adaptive rate selection, and the complementary effects of edge addition and deletion.

📝 Abstract

Graph Neural Networks (GNNs) suffer from overfitting and over-squashing of long-range information. Stochastic graph augmentations (e.g., edge deletion) regularize training against overfitting but can introduce train-inference misalignment and do not improve over-squashing. In contrast, rewiring methods improve connectivity to mitigate over-squashing, but are not designed to regularize training. We propose Random Add-Drop Edge (RADE), a stochastic graph augmentation method that jointly drops and adds edges to address both overfitting and over-squashing simultaneously. RADE is provably designed to align training and inference so that random augmentations regularize training without distribution shift, while supporting long-range communication at inference. We further propose and study a mini-batch gradient-norm balancing algorithm that adapts deletion and addition rates during training, rendering RADE hyperparameter-free in practice. Experiments on node- and graph-classification benchmarks show that RADE is a strong regularizer and mitigates over-squashing. Ablations support the roles of train-inference alignment, adaptive rate selection, and the complementary effects of random edge deletion and edge addition.
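The core idea of jointly dropping and adding edges can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `random_add_drop_edges`, the parameters `p_drop` and `p_add`, and the choice of independent per-pair sampling are all assumptions made for illustration. The abstract does not specify the sampling scheme, and the train-inference alignment and the adaptive rate selection are not reproduced here.

```python
import itertools
import random


def random_add_drop_edges(num_nodes, edges, p_drop, p_add, rng=None):
    """Return a randomly perturbed copy of an undirected edge set.

    Assumption (not taken from the paper): each existing edge is dropped
    independently with probability p_drop, and each absent node pair is
    added independently with probability p_add.
    """
    rng = rng or random.Random()
    # Canonicalize undirected edges as (min, max) and ignore self-loops.
    existing = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    perturbed = set()
    # Visiting every node pair is quadratic in num_nodes; fine for a sketch.
    for pair in itertools.combinations(range(num_nodes), 2):
        if pair in existing:
            # Keep an existing edge with probability 1 - p_drop.
            if rng.random() >= p_drop:
                perturbed.add(pair)
        elif rng.random() < p_add:
            # Add a previously absent edge with probability p_add.
            perturbed.add(pair)
    return perturbed


# A 5-node path graph: 0-1-2-3-4.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
sample = random_add_drop_edges(5, path, p_drop=0.2, p_add=0.1, rng=random.Random(0))
print(sorted(sample))
```

In a training loop, a fresh perturbed edge set would typically be drawn for each step, so the model sees a different graph every time. Added edges can create shortcuts between distant nodes, which is the intuition behind combining the regularizing effect of deletion with the connectivity benefit of addition.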

Problem

Research questions and friction points this paper is trying to address.

overfitting

over-squashing

Graph Neural Networks

stochastic graph augmentation

train-inference misalignment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Random Add-Drop Edge

Graph Neural Networks

over-squashing