GNN-based Anchor Embedding for Exact Subgraph Matching

📅 2025-01-23
🤖 AI Summary
To address the lack of theoretical exactness guarantees in learning-based subgraph matching for graph data management, this paper proposes GNN-AE, the first verifiably exact GNN-based embedding framework. Its core innovation is the introduction of *anchors*, *anchor graphs*, and *anchor paths*, which reformulate exact subgraph matching as a search for anchor structures in the embedding space. The paper designs a GNN embedding mechanism with theoretically proven completeness (i.e., zero false negatives) and combines it with a matching-growth algorithm and a cost-driven depth-first search (DFS) query plan. Experiments on six real-world and three synthetic datasets show that GNN-AE achieves 100% recall, significantly outperforming state-of-the-art approximate methods, while delivering up to 8.2× speedup in end-to-end matching latency. It is presented as the first approach to achieve both rigorous theoretical correctness and practical efficiency for exact subgraph matching.

📝 Abstract
Subgraph matching is a classic problem in graph data management with many real-world applications, such as discovering structures in biological or chemical networks, finding communities in social network analysis, and explaining neural networks. Several recent works apply deep-learning-based techniques to subgraph matching queries. However, most of them return only approximate results, without theoretical guarantees of accuracy. In this paper, we propose a novel and effective graph neural network (GNN)-based anchor embedding framework (GNN-AE) that supports exact subgraph matching. Unlike GNN-based approximate approaches, which produce inexact results, we introduce a series of anchor-related concepts for subgraph matching (anchor, anchor graph, anchor path, etc.) and carefully devise an anchor (graph) embedding technique based on GNN models. Using the anchor graph and anchor path embeddings, we transform the subgraph matching problem into a search problem in the embedding space. With the proposed anchor matching mechanism, GNN-AE guarantees that subgraph matching has no false dismissals. We design an efficient matching growth algorithm that retrieves the locations of all exact matches in parallel, and we propose a cost-model-based DFS query plan to enhance this parallel matching growth algorithm. Extensive experiments on 6 real-world and 3 synthetic datasets confirm the effectiveness and efficiency of GNN-AE for exact subgraph matching.
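To make "exact matching with no false dismissals" concrete, here is a minimal Python sketch of the general filter-then-grow pattern. It is not the GNN-AE method: the abstract does not specify the anchor embeddings, so a plain label-and-degree test stands in for the embedding-space search, and every name here (`filter_candidates`, `exact_matches`, the toy graphs) is a hypothetical illustration. The only point it shows is that a filter built from a necessary condition can prune candidates without ever discarding a true match.

```python
from typing import Dict, Hashable, List, Set

Adj = Dict[Hashable, Set[Hashable]]  # undirected graph as adjacency sets
Labels = Dict[Hashable, str]         # vertex -> label


def filter_candidates(q_adj: Adj, q_lab: Labels, g_adj: Adj, g_lab: Labels):
    """Stand-in filter (assumption, not the paper's embedding search).

    A data vertex v can host query vertex u only if the labels agree and
    deg(v) >= deg(u). Both are necessary conditions for a match, so no
    true match is ever pruned (no false dismissals).
    """
    return {
        u: {v for v in g_adj
            if g_lab[v] == q_lab[u] and len(g_adj[v]) >= len(q_adj[u])}
        for u in q_adj
    }


def exact_matches(q_adj: Adj, q_lab: Labels, g_adj: Adj, g_lab: Labels,
                  use_filter: bool = True) -> List[dict]:
    """Enumerate all label-preserving, injective, edge-preserving mappings."""
    if use_filter:
        cand = filter_candidates(q_adj, q_lab, g_adj, g_lab)
    else:
        cand = {u: set(g_adj) for u in q_adj}
    # Visit the most selective query vertices first (a simple heuristic,
    # not the cost model mentioned in the abstract).
    order = sorted(q_adj, key=lambda u: len(cand[u]))
    results: List[dict] = []

    def grow(i: int, mapping: dict, used: set) -> None:
        if i == len(order):
            results.append(dict(mapping))
            return
        u = order[i]
        for v in cand[u]:
            if v in used or g_lab[v] != q_lab[u]:
                continue
            # Every already-mapped query neighbor must be adjacent to v.
            if all(mapping[w] in g_adj[v] for w in q_adj[u] if w in mapping):
                mapping[u] = v
                used.add(v)
                grow(i + 1, mapping, used)
                del mapping[u]
                used.remove(v)

    grow(0, {}, set())
    return results


# Toy data graph: triangle 0-1-2 plus a tail 2-3-4.
G_ADJ = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
G_LAB = {0: "A", 1: "B", 2: "C", 3: "A", 4: "B"}
# Query: a labelled triangle A-B-C.
Q_ADJ = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
Q_LAB = {"a": "A", "b": "B", "c": "C"}

filtered = exact_matches(Q_ADJ, Q_LAB, G_ADJ, G_LAB, use_filter=True)
unfiltered = exact_matches(Q_ADJ, Q_LAB, G_ADJ, G_LAB, use_filter=False)
```

On this toy input both calls return the same single mapping (a to 0, b to 1, c to 2). The filter only shrinks the search space, for example by dropping vertex 4 as a host for `b` because its degree is too low.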
Problem

Research questions and friction points this paper is trying to address.

Exact subgraph matching via GNN
Anchor embedding ensures no false dismissals
Efficient parallel matching growth algorithm
Innovation

Methods, ideas, or system contributions that make the work stand out.

GNN-based anchor embedding
Exact subgraph matching
Parallel matching growth algorithm
Bin Yang
Harbin Institute of Technology, Harbin, Heilongjiang, China
Zhaonian Zou
Harbin Institute of Technology, China
Databases, Data Mining
Jianxiong Ye
Harbin Institute of Technology, Harbin, Heilongjiang, China