Towards Scalable and Deep Graph Neural Networks via Noise Masking

📅 2024-12-19
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address key bottlenecks of Graph Neural Networks (GNNs) on large-scale graphs, including high computation and storage overhead, over-smoothing induced by deep propagation, and expensive pre-processing, this paper proposes the Random Walk with Noise Masking (RMask) module. It identifies propagation noise, rather than depth alone, as a primary cause of over-smoothing, and introduces a lightweight noise-modeling framework coupled with a feature-propagation masking mechanism. Operating within the Propagation-Combine (P-C) simplification paradigm, RMask enables continuous and controllable deep feature propagation, striking a favorable balance between architectural simplicity and representational capacity. Evaluated on six real-world graph datasets, it consistently outperforms existing simplification approaches in accuracy while keeping pre-processing cost and end-to-end runtime low. RMask thus unifies scalability and deep expressive power in GNNs.

📝 Abstract
In recent years, Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks. However, scaling them to large graphs is challenging due to the high computational and storage costs of repeated feature propagation and non-linear transformation during training. One commonly employed approach to address this challenge is model simplification, which executes the Propagation (P) step only once during pre-processing, Combines (C) the resulting receptive fields in different ways, and then feeds them into a simple model for better performance. Despite their high predictive performance and scalability, these methods still face two limitations. First, existing approaches mainly explore different C methods from the model perspective, neglecting the crucial data-centric problem of performance degradation as P depth increases, known as the over-smoothing problem. Second, pre-processing overhead takes up most of the end-to-end processing time, especially on large-scale graphs. To address these limitations, we present random walk with noise masking (RMask), a plug-and-play module compatible with existing model-simplification works. This module enables the exploration of deeper GNNs while preserving their scalability. Unlike previous model-simplification works, we focus on continuous P, find that the noise present within each P step is the cause of the over-smoothing issue, and use an efficient masking mechanism to eliminate it. Experimental results on six real-world datasets demonstrate that model-simplification works equipped with RMask outperform their original versions and strike a good trade-off between accuracy and efficiency.
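The P-C simplification paradigm the abstract describes can be sketched in a few lines: propagation is precomputed once as repeated multiplication by a normalized adjacency matrix, and the resulting receptive fields are then combined before being fed to a simple model. This is a minimal SGC/SIGN-style sketch, not the paper's code; the mean combiner is one arbitrary choice of C, and the toy graph also shows why depth alone collapses features (over-smoothing).

```python
import numpy as np

def normalized_adjacency(A):
    # Symmetrically normalized adjacency with self-loops:
    # D^{-1/2} (A + I) D^{-1/2}, the standard GNN propagation operator.
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def precompute_propagation(A, X, depth):
    # Propagation (P), executed once in pre-processing:
    # returns [X, A_hat X, A_hat^2 X, ..., A_hat^depth X].
    A_hat = normalized_adjacency(A)
    feats = [X]
    for _ in range(depth):
        feats.append(A_hat @ feats[-1])
    return feats

def combine_mean(feats):
    # One simple Combine (C) choice: average over propagation depths.
    # The combined features go into a plain MLP/logistic model downstream.
    return np.mean(np.stack(feats), axis=0)
```

On a two-node toy graph the propagated features converge after a single step, so deeper P adds no information; this is the over-smoothing behavior that RMask attributes to noise inside each P step.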
Problem

Research questions and friction points this paper is trying to address.

Addressing scalability and depth limitations in Graph Neural Networks
Mitigating over-smoothing in deep graph neural networks
Reducing pre-processing overhead for large-scale graph tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random walk with noise masking (RMask) module
Eliminates noise in continuous propagation (P)
Improves scalability and depth of GNNs
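The bullets above describe masking noise during continuous propagation. The paper's actual random-walk-based noise criterion is not detailed in this summary, so the sketch below uses a purely hypothetical stand-in rule (freeze feature entries whose per-step update falls below a threshold) just to show where a mask would sit inside the P loop; `tol` and the masking rule are illustrative assumptions, not RMask's algorithm.

```python
import numpy as np

def propagate_with_mask(A_hat, X, depth, tol=1e-3):
    # Hypothetical masking rule, NOT the paper's RMask criterion:
    # entries whose update magnitude drops below `tol` are treated as
    # contributing only smoothing "noise" and are frozen at their
    # current value, while informative entries keep propagating.
    H = X.copy()
    outs = [H]
    for _ in range(depth):
        H_new = A_hat @ H
        mask = np.abs(H_new - H) > tol  # keep only meaningful updates
        H = np.where(mask, H_new, H)
        outs.append(H)
    return outs
```

Because the mask is applied inside each P step, the precomputation stays a single pre-processing pass, which is how a module like this can remain plug-and-play with existing model-simplification pipelines.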