MQ-GNN: A Multi-Queue Pipelined Architecture for Scalable and Efficient GNN Training

📅 2026-01-08
🏛️ IEEE Access
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses inefficiencies in graph neural network (GNN) training caused by slow mini-batch generation, data transfer bottlenecks, and synchronization overhead across multiple GPUs, which collectively lead to low hardware utilization. To overcome these challenges, the authors propose MQ-GNN, a framework featuring a multi-queue pipelined architecture that interleaves the training stages. MQ-GNN introduces the Ready-to-Update Asynchronous Consistent Model (RaCoM) to enable asynchronous gradient sharing with adaptive periodic synchronization. It further incorporates a global neighbor-sampling cache and an adaptive queue-sizing strategy to raise throughput while preserving model consistency. Experiments on four large-scale datasets demonstrate that MQ-GNN achieves up to 4.6× speedup over ten baselines, improves GPU utilization by 30%, and maintains competitive accuracy.
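The stage-interleaving idea in the summary above can be sketched with bounded queues and one thread per stage: while one mini-batch is being trained, the next is being transferred and a third is being sampled. This is a minimal illustration, not MQ-GNN's implementation; the stage bodies (`sample`, `transfer`, `train`) and all names are placeholder assumptions.

```python
import queue
import threading

# Placeholder stage bodies standing in for the real pipeline stages.
def sample(batch_id):
    return f"batch-{batch_id}"           # mini-batch generation (CPU)

def transfer(batch):
    return batch + "-on-gpu"             # host-to-device copy

def train(batch):
    return batch + "-trained"            # forward/backward pass

def stage_worker(fn, inq, outq):
    """Consume items from inq, apply fn, push results to outq."""
    while True:
        item = inq.get()
        if item is None:                 # sentinel: propagate shutdown
            outq.put(None)
            return
        outq.put(fn(item))

def run_pipeline(num_batches, queue_size=4):
    # Bounded queues decouple the stages so sampling, transfer, and
    # training overlap; the queue size caps in-flight batches (memory).
    q_sampled = queue.Queue(maxsize=queue_size)
    q_on_gpu = queue.Queue(maxsize=queue_size)
    q_done = queue.Queue()
    threads = [
        threading.Thread(target=stage_worker, args=(transfer, q_sampled, q_on_gpu)),
        threading.Thread(target=stage_worker, args=(train, q_on_gpu, q_done)),
    ]
    for t in threads:
        t.start()
    for i in range(num_batches):
        q_sampled.put(sample(i))         # producer stage runs inline here
    q_sampled.put(None)                  # signal end of input
    results = []
    while (item := q_done.get()) is not None:
        results.append(item)
    for t in threads:
        t.join()
    return results
```

The bounded queue sizes are the knob that MQ-GNN's adaptive queue-sizing strategy would tune: larger queues hide more stage latency at the cost of memory.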

📝 Abstract
Graph Neural Networks (GNNs) are powerful tools for learning graph-structured data, but their scalability is hindered by inefficient mini-batch generation, data transfer bottlenecks, and costly inter-GPU synchronization. Existing training frameworks fail to overlap these stages, leading to suboptimal resource utilization. This paper proposes MQ-GNN, a multi-queue pipelined framework that maximizes training efficiency by interleaving GNN training stages and optimizing resource utilization. MQ-GNN introduces Ready-to-Update Asynchronous Consistent Model (RaCoM), which enables asynchronous gradient sharing and model updates while ensuring global consistency through adaptive periodic synchronization. Additionally, it employs global neighbor sampling with caching to reduce data transfer overhead and an adaptive queue-sizing strategy to balance computation and memory efficiency. Experiments on four large-scale datasets and ten baseline models demonstrate that MQ-GNN achieves up to 4.6× faster training time and 30% improved GPU utilization while maintaining competitive accuracy. These results establish MQ-GNN as a scalable and efficient solution for multi-GPU GNN training. The code is available at MQ-GNN.
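The RaCoM scheme described in the abstract can be illustrated with a sequential simulation: each worker updates its own parameter copy without waiting on the others, and every few steps all copies are averaged to restore global consistency. This is a sketch under assumptions of our own; the fixed `sync_period` stands in for the paper's adaptive period selection, and all names are illustrative.

```python
def average(params_list):
    """Elementwise mean of several parameter vectors."""
    n = len(params_list)
    return [sum(p) / n for p in zip(*params_list)]

def train_async(num_workers, num_steps, sync_period, grad_fn, lr=0.1):
    # Each worker holds an independent parameter copy (here a 1-element
    # vector) and applies its local gradient without any locking.
    params = [[0.0] for _ in range(num_workers)]
    for step in range(1, num_steps + 1):
        for w in range(num_workers):
            g = grad_fn(params[w], w)              # local gradient
            params[w] = [p - lr * gi for p, gi in zip(params[w], g)]
        if step % sync_period == 0:                # periodic consistency point
            synced = average(params)
            params = [list(synced) for _ in range(num_workers)]
    return average(params)
```

For example, with `grad_fn` set to the gradient of `(p - 1)^2`, all worker copies converge toward 1.0 even though they only synchronize every `sync_period` steps, which is the trade the abstract describes: fewer synchronization barriers in exchange for bounded staleness between copies.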
Problem

Research questions and friction points this paper is trying to address.

Graph Neural Networks
scalability
multi-GPU training
data transfer bottleneck
inter-GPU synchronization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Queue Pipeline
Asynchronous Consistent Model
Global Neighbor Sampling
Adaptive Synchronization
Scalable GNN Training
Irfan Ullah
Department of Computer Science and Engineering, Kyung Hee University (Global Campus), Republic of Korea
Young-Koo Lee
Kyung Hee University
Big Data Processing and Analysis · Data Mining · Database Systems