🤖 AI Summary
Scalability bottlenecks in graph contrastive learning (GCL) on large-scale graphs arise from computationally intensive message passing and the quadratic complexity of conventional contrastive losses. To address this, we propose an efficient unsupervised graph representation learning framework built on three key innovations: (1) a compact node-set network that hierarchically preserves global graph structure; (2) a linear-time kernelized graph community contrastive loss that replaces pairwise node-level contrast with scalable community-level contrast; and (3) a decoupled GNN architecture enhanced by knowledge distillation to jointly optimize training efficiency and inference performance. By integrating graph clustering with a dual-kernel mechanism, our approach balances structural expressiveness with computational efficiency. Evaluated on 16 real-world benchmark datasets, our method consistently outperforms state-of-the-art baselines, achieving significant speedups in both training and inference without sacrificing, and often improving, representation quality.
📝 Abstract
Graph Contrastive Learning (GCL) has emerged as a powerful paradigm for training Graph Neural Networks (GNNs) in the absence of task-specific labels. However, its scalability on large-scale graphs is hindered by the intensive message-passing mechanism of GNNs and the quadratic computational complexity of the contrastive loss over positive and negative node pairs. To address these issues, we propose an efficient GCL framework that transforms the input graph into a compact network of interconnected node sets while preserving structural information across communities. We first introduce a kernelized graph community contrastive loss with linear complexity, enabling effective information transfer among node sets to capture the hierarchical structure of the graph. We then incorporate a knowledge distillation technique into a decoupled GNN architecture to accelerate inference while maintaining strong generalization performance. Extensive experiments on sixteen real-world datasets of varying scales demonstrate that our method outperforms state-of-the-art GCL baselines in both effectiveness and scalability.
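The complexity claim above can be made concrete with a minimal sketch. A standard InfoNCE-style loss contrasts every node against all others, giving O(n²) pairs; contrasting each node only against k community centroids reduces this to O(nk), which is linear in n for fixed k. The sketch below is an illustrative simplification, not the paper's kernelized dual-kernel loss: the function name, the use of plain cosine-similarity centroids, and the NumPy implementation are all assumptions for demonstration.

```python
import numpy as np

def community_contrastive_loss(z, communities, tau=0.5):
    """Illustrative community-level contrastive loss (hypothetical sketch).

    Instead of contrasting each of n nodes against the other n-1 nodes
    (O(n^2) pairs), each node is contrasted against k community centroids
    (O(n*k) pairs): its own community's centroid is the positive, the
    remaining centroids serve as negatives.
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalize embeddings
    k = communities.max() + 1
    # Community centroids: mean of member embeddings, renormalized.
    centroids = np.stack([z[communities == c].mean(axis=0) for c in range(k)])
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
    logits = (z @ centroids.T) / tau                   # (n, k) similarity scores
    # InfoNCE over centroids, computed with a numerically stable log-softmax.
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(z)), communities].mean()
```

In the full method, node-set (community) assignments would come from the graph clustering step and similarities from the kernel functions; here random assignments suffice to show the O(nk) cost structure.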