🤖 AI Summary
This work addresses the challenges of redundant computation, high communication overhead, and latency in traditional graph neural network (GNN) inference on dynamic graphs, where structural and feature changes occur frequently. The authors propose the first general-purpose streaming GNN incremental inference framework, which models GNN aggregation semantics to update only the neighborhood embeddings affected by each graph modification. The framework supports vertex/edge insertions and deletions as well as feature updates, and is compatible with both single-machine and distributed deployments. It delivers low-latency, high-throughput inference while avoiding the non-determinism of sampling-based methods. Experiments show that on a single machine, the system achieves 56K and 7.6K updates per second on the Arxiv and Products graphs, respectively, a 2.2–24× throughput improvement over baselines; in distributed settings, it delivers approximately 25× higher throughput and reduces communication overhead by 20×.
📝 Abstract
Real-world graphs are dynamic, with frequent updates to their structure and features due to evolving vertex and edge properties. These continual changes pose significant challenges for efficient inference in graph neural networks (GNNs). Existing vertex-wise and layer-wise inference approaches are ill-suited for dynamic graphs, as they incur redundant computations, large neighborhood traversals, and high communication costs, especially in distributed settings. Additionally, while sampling-based approaches can be adopted to approximate final-layer embeddings, they are often avoided in critical applications due to their non-determinism. These limitations hinder the low-latency inference required in real-time applications. To address this, we propose RIPPLE++, a framework for streaming GNN inference that efficiently and accurately updates embeddings in response to changes in graph structure or features. RIPPLE++ introduces a generalized incremental programming model that captures the semantics of GNN aggregation functions and incrementally propagates updates to affected neighborhoods. It accommodates all common graph updates, including vertex/edge additions and deletions and vertex feature updates, and supports both single-machine and distributed deployments. On a single machine, it achieves up to $56$K updates/sec on sparse graphs like Arxiv ($169$K vertices, $1.2$M edges) and about $7.6$K updates/sec on denser graphs like Products ($2.5$M vertices, $123.7$M edges), with latencies of $0.06$--$960$ms, outperforming state-of-the-art baselines by $2.2$--$24\times$ in throughput. In distributed settings, RIPPLE++ offers up to $\approx25\times$ higher throughput and $20\times$ lower communication costs compared to recomputation baselines.
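To make the incremental idea concrete, here is a minimal sketch of delta-based neighbor aggregation; this is an illustrative toy, not RIPPLE++'s actual implementation, and the class and method names (`IncrementalSumAggregator`, `update_feature`, etc.) are hypothetical. The key point it demonstrates is that an edge change or feature update touches only the affected vertex and its one-hop neighbors, rather than recomputing every embedding:

```python
class IncrementalSumAggregator:
    """Toy one-layer sum aggregation maintained incrementally (illustrative only)."""

    def __init__(self):
        self.features = {}   # vertex -> feature vector (list of floats)
        self.agg = {}        # vertex -> running sum of neighbor features
        self.neighbors = {}  # vertex -> set of adjacent vertices

    def add_vertex(self, v, feat):
        self.features[v] = list(feat)
        self.agg[v] = [0.0] * len(feat)
        self.neighbors[v] = set()

    def add_edge(self, u, v):
        # Only u's and v's aggregates change; the rest of the graph is untouched.
        self.neighbors[u].add(v)
        self.neighbors[v].add(u)
        self.agg[u] = [a + b for a, b in zip(self.agg[u], self.features[v])]
        self.agg[v] = [a + b for a, b in zip(self.agg[v], self.features[u])]

    def remove_edge(self, u, v):
        # Deletion is the inverse delta: subtract the removed neighbor's features.
        self.neighbors[u].discard(v)
        self.neighbors[v].discard(u)
        self.agg[u] = [a - b for a, b in zip(self.agg[u], self.features[v])]
        self.agg[v] = [a - b for a, b in zip(self.agg[v], self.features[u])]

    def update_feature(self, v, new_feat):
        # Propagate only the delta to v's one-hop neighbors (the "ripple").
        delta = [n - o for n, o in zip(new_feat, self.features[v])]
        self.features[v] = list(new_feat)
        for u in self.neighbors[v]:
            self.agg[u] = [a + d for a, d in zip(self.agg[u], delta)]
```

In a multi-layer GNN the same delta would be propagated hop by hop through the affected k-hop neighborhood; invertible aggregators such as sum and mean admit this kind of exact incremental maintenance, whereas non-invertible ones (e.g. max) need extra bookkeeping.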