🤖 AI Summary
Existing graph models for relational data inadequately capture spatiotemporal dependencies: they often neglect temporal dynamics, treating time merely as a filtering constraint, and are typically restricted to single-task prediction. To address this, the authors propose the Relational Graph Perceiver (RGP), a unified graph transformer framework that enhances global contextual awareness via temporal subgraph sampling and introduces a cross-attention-based latent bottleneck to jointly model heterogeneous entities and relations across space and time. RGP also supports multi-task joint learning through a flexible cross-attention decoder. Evaluated on benchmark datasets including RelBench, SALT, and CTU, RGP achieves state-of-the-art performance on multi-task relational prediction, significantly outperforming prior methods. The results demonstrate its ability to capture long-range spatiotemporal dependencies, its strong cross-domain generalizability, and its architectural scalability.
📝 Abstract
In domains such as healthcare, finance, and e-commerce, the temporal dynamics of relational data emerge from complex interactions, such as those between patients and providers, or users and products across diverse categories. To be broadly useful, models operating on these data must integrate long-range spatial and temporal dependencies across diverse types of entities, while also supporting multiple predictive tasks. However, existing graph models for relational data primarily focus on spatial structure, treating temporal information merely as a filtering constraint to exclude future events rather than as a modeling signal, and are typically designed for single-task prediction. To address these gaps, we introduce a temporal subgraph sampler that enhances global context by retrieving nodes beyond the immediate neighborhood to capture temporally relevant relationships. In addition, we propose the Relational Graph Perceiver (RGP), a graph transformer architecture for relational deep learning that leverages a cross-attention-based latent bottleneck to efficiently integrate information from both structural and temporal contexts. This latent bottleneck integrates signals from different node and edge types into a common latent space, enabling the model to build global context across the entire relational system. RGP also incorporates a flexible cross-attention decoder that supports joint learning across tasks with disjoint label spaces within a single model. Experiments on RelBench, SALT, and CTU show that RGP delivers state-of-the-art performance, offering a general and scalable solution for relational deep learning with support for diverse predictive tasks.
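To make the latent-bottleneck idea concrete, below is a minimal NumPy sketch of Perceiver-style cross-attention: a small, fixed set of learned latent vectors queries a much larger set of input tokens (here standing in for heterogeneous node/edge embeddings already projected to a shared dimension). All sizes, weight matrices, and the single-head formulation are illustrative assumptions, not the paper's actual implementation; the point is that attention cost scales with (num_latents × num_tokens) rather than quadratically in the number of tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def latent_cross_attention(latents, tokens, Wq, Wk, Wv):
    """One cross-attention read: latents attend over all input tokens.

    latents: (M, d) learned latent array (M small)
    tokens:  (N, d) input token embeddings (N can be large)
    """
    q = latents @ Wq                              # (M, d) queries from latents
    k = tokens @ Wk                               # (N, d) keys from inputs
    v = tokens @ Wv                               # (N, d) values from inputs
    scores = q @ k.T / np.sqrt(q.shape[-1])       # (M, N) -- O(M*N), not O(N^2)
    return softmax(scores, axis=-1) @ v           # (M, d) updated latents

# Hypothetical sizes: 4 latents summarize 100 node/edge tokens of width 16.
rng = np.random.default_rng(0)
d, M, N = 16, 4, 100
latents = rng.normal(size=(M, d))
tokens = rng.normal(size=(N, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
out = latent_cross_attention(latents, tokens, Wq, Wk, Wv)
print(out.shape)  # (4, 16): global context compressed into the latent array
```

Because every node and edge type is projected into the same token space before this step, the latent array acts as the shared workspace where structural and temporal signals from the whole relational system are mixed.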