GCoDE: Efficient Device-Edge Co-Inference for GNNs via Architecture-Mapping Co-Search

📅 2025-12-05

🏛️ IEEE Transactions on Computers

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

GNN inference is computationally expensive, and resource-constrained edge devices struggle to meet real-time and energy requirements. To address this, the paper proposes GCoDE, the first automated framework for device-edge co-design of GNN architecture and mapping. The method models communication as an explicit operation and jointly optimizes the GNN architecture and its deployment mapping within a unified search space. System performance awareness, an energy prediction method, and a constraint-based random search strategy enable efficient, energy-aware architecture exploration. Compared with existing approaches, the framework achieves up to 44.9× speedup and 98.2% energy reduction, and its search identifies a solution within 1.5 hours. Key contributions include: (1) a device-edge co-inference engine for GNNs; (2) explicit modeling of communication as an operation in the design space; and (3) an energy prediction method combined with constraint-based search.

📝 Abstract

Graph Neural Networks (GNNs) have emerged as the state-of-the-art graph learning method. However, achieving efficient GNN inference on edge devices poses significant challenges, limiting their application in real-world edge scenarios. This is due to the high computational cost of GNNs and limited hardware resources on edge devices, which prevent GNN inference from meeting real-time and energy requirements. As an emerging paradigm, device-edge co-inference shows potential for improving inference efficiency and reducing energy consumption on edge devices. Despite its potential, research on GNN device-edge co-inference remains scarce, and our findings show that traditional model partitioning methods are ineffective for GNNs. To address this, we propose GCoDE, the first automatic framework for GNN architecture-mapping Co-design and deployment on Device-Edge hierarchies. By abstracting the device communication process into an explicit operation, GCoDE fuses the architecture and mapping scheme in a unified design space for joint optimization. Additionally, GCoDE's system performance awareness enables effective evaluation of architecture efficiency across diverse heterogeneous systems. By analyzing the energy consumption of various GNN operations, GCoDE introduces an energy prediction method that improves energy assessment accuracy and identifies energy-efficient solutions. Using a constraint-based random search strategy, GCoDE identifies the optimal solution in 1.5 hours, balancing accuracy and efficiency. Moreover, the integrated co-inference engine in GCoDE enables efficient deployment and execution of GNN co-inference. Experimental results show that GCoDE can achieve up to 44.9× speedup and 98.2% energy reduction compared to existing approaches across diverse applications and system configurations.
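
The idea of treating communication as an explicit operation inside a single search space can be illustrated with a minimal Python sketch. This is not GCoDE's implementation: the operator names, the latency numbers, the rule that each `comm` switches execution between device and edge, and the `proxy_score` stand-in for model accuracy are all assumptions made for illustration.

```python
import random

# Hypothetical operator set; "comm" is the explicit communication operation.
OPS = ["sample", "aggregate", "combine", "pool", "comm"]

# Made-up per-operator latencies in milliseconds (not measured values).
LATENCY_MS = {
    "device": {"sample": 4.0, "aggregate": 9.0, "combine": 6.0, "pool": 2.0},
    "edge": {"sample": 1.0, "aggregate": 2.0, "combine": 1.5, "pool": 0.5},
}
COMM_MS = 5.0  # assumed cost of one device<->edge transfer


def estimate_latency(arch):
    """Walk the operator sequence; each 'comm' flips the executing side.

    Because placement is derived from where 'comm' appears, one sequence
    encodes both the architecture and its mapping.
    """
    side, total = "device", 0.0
    for op in arch:
        if op == "comm":
            side = "edge" if side == "device" else "device"
            total += COMM_MS
        else:
            total += LATENCY_MS[side][op]
    return total


def proxy_score(arch):
    """Stand-in for accuracy: counts message-passing operators (assumption)."""
    return sum(op in ("aggregate", "combine") for op in arch)


def constrained_random_search(latency_budget_ms, score, trials=200, length=6, seed=0):
    """Sample random sequences and keep the best one that meets the budget."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        arch = [rng.choice(OPS) for _ in range(length)]
        latency = estimate_latency(arch)
        if latency > latency_budget_ms:
            continue  # reject constraint violations before scoring
        s = score(arch)
        if best is None or s > best[0]:
            best = (s, latency, arch)
    return best  # None if no sampled candidate met the budget


best = constrained_random_search(25.0, proxy_score)
print(best)
```

In this sketch a candidate that never emits `comm` runs entirely on the device, so all-device execution and co-inference fall out of the same search rather than being handled as separate cases.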

Problem

Research questions and friction points this paper is trying to address.

Efficient GNN inference on edge devices is challenging

Device-edge co-inference for GNNs lacks effective methods

GNN architecture and mapping must be optimized jointly for energy and speed

Innovation

Methods, ideas, or system contributions that make the work stand out.

Device-edge co-inference framework for GNNs

Joint optimization of architecture and mapping scheme

Energy prediction method for efficient solution identification
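
As a rough illustration of what an operation-level energy estimate might look like, the sketch below sums power times duration over the device's activities during one inference. The abstract does not specify GCoDE's predictor, so the activity categories, the power figures, and the two example timelines are hypothetical.

```python
# Hypothetical device-side power draw in watts; illustrative values only.
POWER_W = {"compute": 4.0, "transmit": 2.5, "idle": 0.8}


def device_energy_mj(segments):
    """Sum energy over (activity, duration_ms) segments; W * ms = mJ."""
    return sum(POWER_W[activity] * ms for activity, ms in segments)


# Assumed timelines: all-on-device inference vs. a co-inference split where
# the device computes briefly, transmits, then idles while the edge works.
all_on_device = [("compute", 30.0)]
co_inference = [("compute", 9.0), ("transmit", 5.0), ("idle", 4.0)]

print(device_energy_mj(all_on_device), device_energy_mj(co_inference))
```

Under these made-up numbers the split timeline costs the device less energy, because transmitting and idling draw less power than computing and the device is busy for a shorter time.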
