🤖 AI Summary
Existing streaming graph partitioning methods support only vertex or edge partitioning and typically optimize a single objective, making it challenging to simultaneously address the diverse requirements of communication, computation, and memory in distributed GNN training. This work proposes SIGMA, a unified framework that, for the first time in a streaming setting, supports both edge-cut-oriented vertex partitioning and vertex-cut-oriented edge partitioning while jointly optimizing load balancing for both vertices and edges. By incorporating global structural information through a clustering-based preprocessing step, SIGMA significantly enhances partition quality without sacrificing streaming efficiency. Experiments on six large-scale graphs demonstrate that SIGMA outperforms existing streaming methods and achieves partition quality, training efficiency, and memory usage comparable to high-quality offline partitioners such as METIS and KaHIP, while remaining compatible with systems like DistGNN and DistDGL.
📝 Abstract
Distributed Graph Neural Network (GNN) training depends critically on how the underlying graph is partitioned across compute resources. Existing graph partitioners focus on either vertex partitioning or edge partitioning and typically optimize only a single communication objective (edge cut or vertex cut) under a single balance constraint (vertex balance or edge balance). We present SIGMA (Streaming Integrated Graph Partitioning with Multi-objective Awareness), a versatile streaming graph partitioner that supports both vertex and edge partitioning within a unified multi-objective, multi-constraint framework. Depending on the target distributed GNN system, SIGMA can be configured for edge-cut-oriented vertex partitioning or vertex-cut-oriented edge partitioning while simultaneously accounting for both vertex and edge balancing. A clustering-based preprocessing stage incorporates global graph structure to improve partition quality while preserving the efficiency and scalability advantages of streaming partitioning. We evaluate SIGMA on six benchmark graphs spanning diverse domains and scales using two distributed GNN training systems: DistGNN (edge-partitioned) and DistDGL (vertex-partitioned). Across both settings, SIGMA consistently achieves strong performance, showing its ability to navigate complex trade-offs between partition quality, training efficiency, and memory consumption, frequently outperforming streaming baselines while remaining competitive with high-quality in-memory partitioners such as METIS, KaHIP, and HEP. These results demonstrate that a unified streaming partitioner can effectively address the communication, compute, and memory challenges of distributed GNN training across fundamentally different system architectures.
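To make the objectives and constraints named in the abstract concrete, here is a minimal sketch of how edge cut, vertex balance, vertex replication (a common way to measure vertex cut), and edge balance can be computed for a given partition. This is not SIGMA's implementation. The function names, the toy graph, and the choice of the replication factor as the vertex-cut measure are assumptions made for illustration.

```python
from collections import Counter

def edge_cut(edges, vertex_part):
    """Vertex partitioning: count edges whose endpoints lie in different blocks."""
    return sum(1 for u, v in edges if vertex_part[u] != vertex_part[v])

def vertex_imbalance(vertex_part, k):
    """Largest block size divided by the ideal size n/k (1.0 means perfect balance)."""
    sizes = Counter(vertex_part.values())
    return max(sizes.values()) / (len(vertex_part) / k)

def replication_factor(edges, edge_part):
    """Edge partitioning: average number of blocks in which each vertex appears."""
    blocks = {}
    for (u, v), b in zip(edges, edge_part):
        blocks.setdefault(u, set()).add(b)
        blocks.setdefault(v, set()).add(b)
    return sum(len(s) for s in blocks.values()) / len(blocks)

def edge_imbalance(edge_part, k):
    """Largest number of edges in a block divided by the ideal m/k."""
    sizes = Counter(edge_part)
    return max(sizes.values()) / (len(edge_part) / k)

# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]
vertex_part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # block id per vertex
edge_part = [0, 0, 0, 1, 1, 1, 1]                    # block id per edge

print(edge_cut(edges, vertex_part))          # only (2, 3) crosses blocks
print(vertex_imbalance(vertex_part, k=2))
print(replication_factor(edges, edge_part))  # vertex 2 is replicated in both blocks
print(edge_imbalance(edge_part, k=2))
```

A single-objective partitioner would minimize one of the two communication measures subject to one of the two balance measures. A multi-objective, multi-constraint formulation of the kind the abstract describes would have to keep track of both balance measures at once.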