Inference-friendly Graph Compression for Graph Neural Networks

πŸ“… 2025-04-17
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the high inference overhead and deployment challenges of Graph Neural Networks (GNNs) on large-scale graphs, this paper proposes IFGC, an inference-friendly graph compression framework. It formally defines *GNN inference equivalence*β€”a novel notion ensuring compressed graphs preserve the original GNN predictions for target nodesβ€”and introduces three configurable compression paradigms: SPGC, (Ξ±,r)-compression, and anchor-based compression, jointly optimizing compression ratio, prediction accuracy, and target-node customization. Methodologically, IFGC integrates graph-structure equivalence modeling, hierarchical aggregation abstraction, error-bounded neighborhood merging, and critical anchor preservation. It further provides lightweight compression algorithms and zero- or low-overhead inference procedures. Evaluated on multiple large-scale graph datasets, IFGC achieves up to 8.2Γ— inference speedup while maintaining >98% prediction accuracy; crucially, the compressed graphs can be fed directly into standard GNNs without decompression.

πŸ“ Abstract
Graph Neural Networks (GNNs) have demonstrated promising performance in graph analysis. Nevertheless, the inference process of GNNs remains costly, hindering their application to large graphs. This paper proposes inference-friendly graph compression (IFGC), a graph compression scheme to accelerate GNN inference. Given a graph $G$ and a GNN $M$, an IFGC computes a small compressed graph $G_c$ that best preserves the inference results of $M$ over $G$, such that the results can be directly inferred by accessing $G_c$ with no or little decompression cost. (1) We characterize IFGC with a class of inference equivalence relations. Such a relation captures the node pairs in $G$ that are not distinguishable for GNN inference. (2) We introduce three practical specifications of IFGC for representative GNNs: structural preserving compression (SPGC), which computes a $G_c$ that can be directly processed by GNN inference without decompression; ($\alpha$, $r$)-compression, which allows a configurable trade-off between compression ratio and inference quality; and anchored compression, which preserves inference results for specific nodes of interest. For each scheme, we introduce compression and inference algorithms with guarantees on efficiency and on the quality of the inferred results. We conduct extensive experiments on diverse sets of large-scale graphs, which verify the effectiveness and efficiency of our graph compression approaches.
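The core idea behind the inference equivalence relation can be illustrated with a minimal sketch (this is an illustrative simplification, not the paper's actual algorithms): nodes with identical neighbor sets and identical features receive identical embeddings under a mean-aggregating GNN layer, so a compressor may group them into a single equivalence class and keep one representative. The function names and the scalar-feature toy graph below are hypothetical.

```python
# Hypothetical sketch of inference-equivalence-style merging (not the
# paper's SPGC algorithm): nodes whose neighbor sets and features coincide
# are indistinguishable to a mean-aggregating GNN layer, so they can share
# one representative in the compressed graph.

from collections import defaultdict

def compress(adj, feat):
    """Group nodes with identical neighbor sets and features.

    adj:  dict node -> set of neighbor nodes
    feat: dict node -> feature (scalar here for brevity)
    Returns a mapping node -> representative node of its class.
    """
    groups = defaultdict(list)
    for v in adj:
        key = (frozenset(adj[v]), feat[v])
        groups[key].append(v)
    rep = {}
    for members in groups.values():
        r = min(members)  # pick one representative per equivalence class
        for v in members:
            rep[v] = r
    return rep

def mean_aggregate(adj, feat, v):
    """One mean-aggregation step for node v."""
    nbrs = adj[v]
    return sum(feat[u] for u in nbrs) / len(nbrs)

# Toy graph: nodes 1 and 2 have the same neighbors {0, 3} and the same
# feature, so mean aggregation cannot tell them apart.
adj  = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
feat = {0: 1.0, 1: 2.0, 2: 2.0, 3: 3.0}

rep = compress(adj, feat)
assert rep[1] == rep[2]  # merged into one equivalence class
assert mean_aggregate(adj, feat, 1) == mean_aggregate(adj, feat, 2)
```

In this toy setting the compressed graph keeps only one of nodes 1 and 2, and inference on the representative reproduces the prediction for both, which is the behavior the paper's equivalence relation is designed to guarantee for real GNNs.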
Problem

Research questions and friction points this paper is trying to address.

Accelerate GNN inference via graph compression
Preserve inference results with minimal decompression
Balance compression ratio and inference quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inference-friendly graph compression for GNNs
Structural preserving compression without decompression
Configurable trade-off between compression and quality