🤖 AI Summary
Problem: The structure of relational databases (RDBs) makes it hard for deep learning models to capture multi-table dependencies. Existing approaches rely solely on primary key-foreign key relationships, either joining tables into a single table or constructing a graph over them, and so neglect implicit composite relationships across tables.
Method: We propose SRP, a unified prediction framework that introduces a new **content-based cross-table retrieval mechanism** and combines it with feature synthesis and graph neural network (GNN) message passing, so that both unary and composite dependencies are modeled together. By going beyond conventional join and graph construction, SRP enlarges the model’s receptive field over the tables.
Contribution/Results: Experiments on five real-world datasets show that SRP outperforms state-of-the-art baselines. A component-wise analysis examines the contribution of each part of SRP and yields practical guidelines, supporting its applicability in industrial settings.
📝 Abstract
Relational databases (RDBs) have become the industry standard for storing massive and heterogeneous data. However, despite their widespread use across various fields, the inherent structure of RDBs hinders them from benefiting from flourishing deep learning methods. Previous research has primarily focused on exploiting the unary dependency among multiple tables through primary key-foreign key relationships, either joining the tables into a single table or constructing a graph among them. This leaves the implicit composite relations among different tables, and with them substantial potential for improvement in predictive modeling, unexplored. In this paper, we propose SRP, a unified predictive modeling framework for relational databases that synthesizes features using the unary dependency, retrieves related information to capture the composite dependency, and propagates messages across a constructed graph to learn adjacent patterns for prediction. By introducing a new retrieval mechanism into RDBs, SRP is designed to capture both the unary and the composite dependencies within a relational database, thereby enlarging the receptive field of tabular data prediction. In addition, we conduct a comprehensive analysis of the components of SRP, offering a nuanced understanding of model behaviors and practical guidelines for future applications. Extensive experiments on five real-world datasets demonstrate the effectiveness of SRP and its potential applicability in industrial scenarios. The code is released at https://github.com/NingLi670/SRP.
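
To make the synthesize, retrieve, propagate pipeline concrete, here is a minimal toy sketch in plain Python. It is not the released SRP code: the tables (`users`, `orders`), the function names, cosine similarity as the retrieval criterion, and mean aggregation as the message-passing step are all assumptions chosen for illustration.

```python
from math import sqrt

# Hypothetical toy database: a target table and a child table linked by a foreign key.
users = {1: {"age": 30.0}, 2: {"age": 31.0}, 3: {"age": 60.0}}
orders = [
    {"user_id": 1, "amount": 10.0},
    {"user_id": 1, "amount": 30.0},
    {"user_id": 2, "amount": 20.0},
    {"user_id": 3, "amount": 100.0},
]


def synthesize(users, orders):
    """Unary dependency: aggregate child rows along the primary key-foreign key link."""
    feats = {}
    for uid, row in users.items():
        amounts = [o["amount"] for o in orders if o["user_id"] == uid]
        mean_amount = sum(amounts) / len(amounts) if amounts else 0.0
        feats[uid] = [row["age"], float(len(amounts)), mean_amount]
    return feats


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(feats, k=1):
    """Composite dependency (assumed form): link each row to its k most similar rows."""
    neighbors = {}
    for uid, vec in feats.items():
        others = [other for other in feats if other != uid]
        others.sort(key=lambda other: cosine(vec, feats[other]), reverse=True)
        neighbors[uid] = others[:k]
    return neighbors


def propagate(feats, neighbors):
    """One round of message passing: average each row with the mean of its neighbors."""
    out = {}
    for uid, vec in feats.items():
        msgs = [feats[n] for n in neighbors[uid]]
        if not msgs:
            out[uid] = list(vec)
            continue
        mean_msg = [sum(col) / len(msgs) for col in zip(*msgs)]
        out[uid] = [(x + m) / 2 for x, m in zip(vec, mean_msg)]
    return out


feats = synthesize(users, orders)      # per-user features from joined rows
neighbors = retrieve(feats, k=1)       # edges that no foreign key provides
embeddings = propagate(feats, neighbors)
```

In this sketch, users 1 and 2 end up linked by retrieval even though no key connects them, which is the kind of relation a pure join or key-based graph would miss. A real system would presumably use learned encoders and a trained GNN in place of these hand-written steps.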