🤖 AI Summary
Current temporal graph learning benchmarks are unreliable, and simple baselines prove unexpectedly competitive with state-of-the-art models, a paradox suggesting overreliance on spurious or superficial graph properties. Method: We propose an interpretable analytical framework grounded in eight fundamental structural and temporal attributes (e.g., density, recency, homophily), evaluated on both controlled synthetic and real-world datasets. Using ablation studies and attribution analysis, we quantitatively assess how well seven state-of-the-art models capture each attribute. Contribution/Results: We find that models effectively learn local connectivity but consistently underperform in modeling temporal recency and higher-order temporal homophily, revealing architectural limitations. Our work shifts evaluation from opaque end-to-end accuracy toward attribute-level interpretability, supporting model diagnosis and trustworthy deployment of temporal graph neural networks.
📝 Abstract
Learning on temporal graphs has become a central topic in graph representation learning, with numerous benchmarks indicating the strong performance of state-of-the-art models. However, recent work has raised concerns about the reliability of benchmark results, noting issues with commonly used evaluation protocols and the surprising competitiveness of simple heuristics. This contrast raises the question of which properties of the underlying graphs temporal graph learning models actually use to form their predictions. We address this by systematically evaluating seven models on their ability to capture eight fundamental attributes related to the link structure of temporal graphs. These include structural characteristics such as density, temporal patterns such as recency, and edge formation mechanisms such as homophily. Using both synthetic and real-world datasets, we analyze how well models learn these attributes. Our findings reveal a mixed picture: models capture some attributes well but fail to reproduce others. In doing so, we expose important limitations of current models. Overall, we believe that our results provide practical insights for the application of temporal graph learning models, and motivate more interpretability-driven evaluations in temporal graph learning research.
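To make the kinds of attributes mentioned above concrete, here is a minimal illustrative sketch, not the paper's code: it computes a simplified density measure and a per-pair recency score on a toy temporal edge list. The function names and exact definitions are assumptions for illustration; the paper's eight attributes are defined formally in the full text.

```python
def temporal_density(edges, num_nodes):
    """Fraction of possible directed node pairs that interact at least once.

    `edges` is a list of (source, destination, timestamp) triples; this is a
    simplified stand-in for a structural density attribute.
    """
    pairs = {(u, v) for u, v, _ in edges}
    return len(pairs) / (num_nodes * (num_nodes - 1))

def recency_scores(edges, query_time):
    """Most recent interaction time per node pair before `query_time`.

    A simple 'most-recent-neighbor' heuristic of the kind the abstract calls
    surprisingly competitive would rank candidate links by this value.
    """
    last_seen = {}
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        if t < query_time:
            last_seen[(u, v)] = t
    return last_seen

# Toy temporal edge list: (source, destination, timestamp)
edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 1, 3.0), (2, 0, 4.0)]

density = temporal_density(edges, num_nodes=3)  # 3 distinct pairs out of 6
recent = recency_scores(edges, query_time=3.5)  # (0, 1) last seen at t=3.0
```

A recency-based heuristic like this needs no training at all, which is what makes its competitiveness with learned models a useful diagnostic signal.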