🤖 AI Summary
State-of-the-art link prediction (LP) models achieve strong performance on standard i.i.d. benchmarks, yet the underlying i.i.d. assumption rarely holds in practice: new links often emerge from subgraph structures whose distribution differs significantly from that of the training data. Method: We formally define *link-level distribution shift* and construct it via a controllable non-i.i.d. data-partitioning strategy grounded in graph structural properties (e.g., degree distribution, clustering coefficient, path length). We further design a structure-driven evaluation framework for LP generalization that integrates invariant learning and structural regularization. Contribution/Results: We uncover a counterintuitive phenomenon: mainstream LP models (GNNs, MLPs, KG embeddings) generalize substantially worse under distribution shift than simple heuristic baselines; on average, the shift degrades SOTA model AUC by 12.7%. We release LPStructGen, the first dedicated benchmark for LP generalization, together with a unified experimental framework.
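For context, the "simple heuristic baselines" referred to above are typically parameter-free structural scores such as Common Neighbors, which ranks a candidate link by how many neighbors its endpoints share. A minimal sketch (illustrative only; the function and edge list below are assumptions, not the paper's code):

```python
def build_adj(edges):
    """Build an adjacency-set map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbors_score(adj, u, v):
    """Common Neighbors heuristic: score candidate link (u, v) by the
    number of neighbors u and v share. No training involved, which is
    partly why such baselines can be robust under distribution shift."""
    return len(adj[u] & adj[v])

# Toy graph: nodes 0 and 3 are not linked but share neighbors 1 and 2.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
adj = build_adj(edges)
score = common_neighbors_score(adj, 0, 3)  # -> 2
```

Because the score depends only on local structure, it requires no fitting to the training distribution, unlike the learned LP models evaluated in the paper.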
📝 Abstract
Recently, multiple models proposed for link prediction (LP) have demonstrated impressive results on benchmark datasets. However, these popular benchmarks typically assume that samples are drawn from the same distribution (i.e., IID samples). In real-world settings, this assumption is often violated, since uncontrolled factors can cause train and test samples to come from different distributions. To tackle the distribution shift problem, recent work focuses on creating datasets that feature distribution shifts and on designing generalization methods that perform well on them. However, those studies only consider shifts that affect *node-* and *graph-level* tasks, ignoring link-level tasks. Furthermore, relatively few LP generalization methods exist. To bridge this gap, we introduce a set of LP-specific data splits that exploit structural properties to induce a controlled distribution shift. We verify the shift's effect empirically by evaluating different SOTA LP methods, and we subsequently couple these methods with generalization techniques. Interestingly, LP-specific methods frequently generalize poorly relative to heuristics or basic GNN methods. Finally, this work provides analysis to uncover insights for enhancing LP generalization. Our code is available at: [https://github.com/revolins/LPStructGen](https://github.com/revolins/LPStructGen)
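As a concrete illustration of how a structural property can induce a controlled shift between splits, the sketch below scores each edge by the summed degree of its endpoints, sorts, and cuts the ranking into train/valid/test, so test edges come from structurally denser regions than train edges. This is a hedged toy example: the function name, split ratios, and degree-sum score are assumptions for illustration, not the paper's exact partitioning procedure.

```python
from collections import Counter

def structural_split(edges, ratios=(0.6, 0.2, 0.2)):
    """Partition an edge list by a structural score to induce a shift.

    Each edge is scored by the sum of its endpoints' degrees; low-score
    edges go to train and high-score edges to test, so the test split is
    drawn from a different structural distribution than train.
    Illustrative sketch only -- not the paper's exact strategy.
    """
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Stable sort: ties keep their original order.
    ranked = sorted(edges, key=lambda e: deg[e[0]] + deg[e[1]])
    n_train = int(ratios[0] * len(ranked))
    n_valid = int(ratios[1] * len(ranked))
    return (ranked[:n_train],
            ranked[n_train:n_train + n_valid],
            ranked[n_train + n_valid:])

# Toy graph: a hub (node 0) attached to a short path.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5), (5, 6)]
train, valid, test = structural_split(edges)
# test ends up holding the hub-adjacent (high-degree) edges
```

Sweeping which end of the ranking feeds the test split (or swapping in clustering coefficient or path-based scores) yields the kind of controllable shift severity the splits above are designed to provide.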