🤖 AI Summary
This paper addresses representation learning for dynamic text-attributed graphs (DyTAGs). The proposed method jointly models structural, temporal, and textual information. Its core contributions are: (1) a novel large language model (LLM)-driven knowledge distillation framework that transfers LLMs’ semantic understanding of neighborhood text to a lightweight spatiotemporal graph neural network (GNN); and (2) an edge-level temporal encoding mechanism enabling fine-grained spatiotemporal joint modeling within graph convolution. The approach integrates LLMs, knowledge distillation, dynamic GNNs, timestamp encoding, and text embedding techniques. Evaluated on six real-world DyTAG datasets, it achieves improvements of +12.7% in F1-score for future link prediction and +9.3% in AUC for edge classification, significantly outperforming state-of-the-art baselines.
📝 Abstract
Dynamic Text-Attributed Graphs (DyTAGs) have numerous real-world applications, e.g. social, collaboration, citation, communication, and review networks. In these networks, nodes and edges often contain text descriptions, and the graph structure can evolve over time. Future link prediction, edge classification, relation generation, and other downstream tasks on DyTAGs require powerful representations that encode structural, temporal, and textual information. Although graph neural networks (GNNs) excel at handling structured data, encoding temporal information within dynamic graphs remains a significant challenge. In this work, we propose LLM-driven Knowledge Distillation for Dynamic Text Attributed Graph (LKD4DyTAG) with temporal encoding to address these challenges. We use a simple, yet effective approach to encode temporal information in edges so that graph convolution can simultaneously capture both temporal and structural information in the hidden representations. To leverage LLM's text processing capabilities for learning richer representations on DyTAGs, we distill knowledge from LLM-driven edge representations (based on a neighborhood's text attributes) into saptio-temporal representations using a lightweight GNN model that encodes temporal and structural information. The objective of knowledge distillation enables the GNN to learn representations that more effectively encode the available structural, temporal, and textual information in DyTAG. We conducted extensive experimentation on six real-world DyTAG datasets to verify the effectiveness of our approach LKD4DyTAG for future link prediction and edge classification task. The results show that our approach significantly improves the performance of downstream tasks compared to the baseline models.