🤖 AI Summary
A systematic survey of graph neural network (GNN)-enhanced database systems, and a theoretical foundation for them, are still lacking. This paper presents the first taxonomy of GNN methods for database systems, built on a two-category classification framework (relational databases vs. graph databases) that spans core tasks including query performance prediction, query optimization, text-to-SQL translation, and graph query acceleration. Methodologically, it covers query execution plan modeling, SQL semantic encoding, subgraph matching representation learning, and workload characterization techniques. Based on a comprehensive review of over 60 studies, the work reports average query latency reductions of 30–50% and top-1 query plan selection accuracy reaching 89%, and it identifies cross-paradigm data integration as a critical future direction.
📝 Abstract
Graph neural networks (GNNs) are powerful deep learning models for graph-structured data and have demonstrated remarkable success across diverse domains. Recently, the database (DB) community has increasingly recognized the potential of GNNs, prompting a surge of research on improving database systems through GNN-based approaches. However, despite notable advances, a comprehensive review and understanding of how GNNs can improve DB systems is still lacking. This survey aims to bridge that gap by providing a structured and in-depth overview of GNNs for DB systems. Specifically, we propose a new taxonomy that classifies existing methods into two key categories: (1) Relational Databases, which covers tasks such as performance prediction, query optimization, and text-to-SQL, and (2) Graph Databases, which addresses challenges such as efficient graph query processing and graph similarity computation. We systematically review key methods in each category, highlighting their contributions and practical implications. Finally, we suggest promising avenues for integrating GNNs into database systems.