🤖 AI Summary
Graph Neural Networks (GNNs) suffer from poor scalability and low training efficiency when trained with uniform neighbor sampling and static fanout configurations. To address these limitations, we propose DAFOS, a Dynamic Adaptive Fanout Optimization Sampling framework. Its core contributions are: (1) a structural importance scoring mechanism based on node degree; (2) dynamic, incremental per-layer fanout adjustment during training; (3) importance-weighted neighbor sampling; and (4) an early-stopping strategy guided by monitored performance gain. Evaluated on ogbn-arxiv, Reddit, and ogbn-products, DAFOS achieves training speedups of 3.57×, 12.6×, and roughly 3.2×, respectively, while raising the F1 score on ogbn-arxiv to 71.21% (+2.71 points) and on ogbn-products to 76.88% (+3.10 points); F1 figures for Reddit are not reported. The method substantially accelerates convergence and improves accuracy, balancing computational efficiency with representational expressiveness.
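The first three contributions can be sketched in a few lines. The following is a minimal, hypothetical illustration of degree-based node scoring, importance-weighted neighbor sampling, and an incremental fanout schedule; the helper names and the exact schedule are assumptions for exposition, not the authors' implementation.

```python
import random

def degree_scores(adj):
    """Score each node by its degree, a simple proxy for structural
    importance. `adj` maps node -> list of neighbors (hypothetical format)."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def sample_neighbors(adj, node, fanout, scores, rng=random):
    """Importance-weighted neighbor sampling: when the fanout is smaller
    than the degree, higher-scoring neighbors are more likely to be kept."""
    nbrs = adj[node]
    if len(nbrs) <= fanout:
        return list(nbrs)
    # Weighted sampling without replacement via repeated weighted draws.
    pool, weights, chosen = list(nbrs), [scores[u] for u in nbrs], []
    for _ in range(fanout):
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(pick))
        weights.pop(pick)
    return chosen

def fanout_schedule(base_fanout, epoch, step=1, cap=25):
    """Incrementally grow the fanout as training progresses, capped so
    late epochs do not degenerate into full-neighborhood aggregation."""
    return min(base_fanout + step * epoch, cap)
```

In a real GNN training loop this sampler would run per layer inside mini-batch construction (e.g. in a DGL or PyG data loader); the linear-with-cap schedule here is only one plausible instance of "incremental fanout adjustment."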
📝 Abstract
Graph Neural Networks (GNNs) are becoming an essential tool for learning from graph-structured data; however, uniform neighbor sampling and static fanout settings frequently limit their scalability and efficiency. In this paper, we propose the Dynamic Adaptive Fanout Optimization Sampler (DAFOS), a novel approach that dynamically adjusts the fanout based on model performance and prioritizes important nodes during training. Our approach scores nodes by degree to focus computational resources on structurally important nodes, and increments the fanout as training progresses. DAFOS also integrates an early-stopping mechanism that halts training when performance gains diminish. Experiments on three benchmark datasets, ogbn-arxiv, Reddit, and ogbn-products, demonstrate that our approach significantly improves training speed and accuracy over a state-of-the-art baseline. DAFOS achieves a 3.57x speedup on ogbn-arxiv and a 12.6x speedup on Reddit, while improving the F1 score from 68.5% to 71.21% on ogbn-arxiv and from 73.78% to 76.88% on ogbn-products. These results highlight the potential of DAFOS as an efficient and scalable solution for large-scale GNN training.
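The early-stopping component described above can be sketched as a small stateful monitor. This is an illustrative assumption about the rule (stop after several consecutive epochs whose validation gain falls below a threshold); the paper's exact criterion, threshold, and patience values are not given here.

```python
class GainEarlyStopper:
    """Halt training once validation-score gains diminish: if the
    improvement over the best score seen so far stays below `min_gain`
    for `patience` consecutive epochs, signal a stop. Hypothetical
    sketch of DAFOS's early-stopping strategy, not the authors' code."""

    def __init__(self, min_gain=1e-3, patience=3):
        self.min_gain = min_gain
        self.patience = patience
        self.best = float("-inf")
        self.stale = 0

    def should_stop(self, val_score):
        if val_score - self.best > self.min_gain:
            # Meaningful improvement: record it and reset the counter.
            self.best = val_score
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```

A training loop would call `should_stop(val_f1)` once per epoch and break when it returns `True`, trading a small amount of potential accuracy for the large wall-clock savings the speedup numbers reflect.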