FOS: A Large-Scale Temporal Graph Benchmark for Scientific Interdisciplinary Link Prediction

📅 2025-11-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the prediction of unexpectedly emerging interdisciplinary research directions, a central challenge in the science of science. Method: We introduce FOS, a large-scale temporal graph benchmark for scientific frontier forecasting (1827–2024), comprising 65,027 subfields and their annual co-occurrence relationships, with a focus on predicting “first-time interdisciplinary associations.” The benchmark models co-occurrences as timestamped edges, enriches nodes with semantic embeddings of long-form field descriptions, attaches temporal and topological descriptors to edges, and evaluates state-of-the-art temporal graph neural networks under multiple negative-sampling regimes. Contribution/Results: Experiments show that semantic embeddings substantially improve prediction accuracy; different model classes excel under different evaluation settings; and top-ranked predictions align with field pairings that appear in publications in subsequent years. FOS is released with its temporal data splits and evaluation code as a reproducible benchmark for predicting scientific frontiers.

📝 Abstract

Interdisciplinary scientific breakthroughs mostly emerge unexpectedly, and forecasting the formation of novel research fields remains a major challenge. We introduce FOS (Future Of Science), a comprehensive time-aware graph-based benchmark that reconstructs annual co-occurrence graphs of 65,027 research sub-fields (spanning 19 general domains) over the period 1827-2024. In these graphs, edges denote the co-occurrence of two fields in a single publication and are timestamped with the corresponding publication year. Nodes are enriched with semantic embeddings, and edges are characterized by temporal and topological descriptors. We formulate the prediction of new field-pair linkages as a temporal link-prediction task, emphasizing the "first-time" connections that signify pioneering interdisciplinary directions. Through extensive experiments, we evaluate a suite of state-of-the-art temporal graph architectures under multiple negative-sampling regimes and show that (i) embedding long-form textual descriptions of fields significantly boosts prediction accuracy, and (ii) distinct model classes excel under different evaluation settings. Case analyses show that top-ranked link predictions on FOS align with field pairings that emerge in subsequent years of academic publications. We publicly release FOS, along with its temporal data splits and evaluation code, to establish a reproducible benchmark for advancing research in predicting scientific frontiers.
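The graph construction described in the abstract can be illustrated with a minimal sketch. It assumes publications arrive as (year, list of field names) records; the function names and the toy data are hypothetical and are not taken from the FOS release. Each annual graph holds the field pairs that co-occur in at least one publication that year, and a pair counts as "first-time" in the earliest year it appears.

```python
from collections import defaultdict
from itertools import combinations


def annual_cooccurrence(publications):
    """Group field pairs by publication year.

    publications: iterable of (year, [field, ...]) records (assumed format).
    Returns {year: set of (u, v) pairs with u < v}.
    """
    by_year = defaultdict(set)
    for year, fields in publications:
        # Sort and deduplicate so each undirected pair has one canonical form.
        for u, v in combinations(sorted(set(fields)), 2):
            by_year[year].add((u, v))
    return dict(by_year)


def first_time_edges(by_year):
    """Return {year: pairs that never co-occurred in any earlier year}."""
    seen = set()
    novel = {}
    for year in sorted(by_year):
        novel[year] = by_year[year] - seen
        seen |= by_year[year]
    return novel


# Toy data for illustration only.
toy_publications = [
    (2001, ["genomics", "statistics"]),
    (2002, ["genomics", "statistics", "machine learning"]),
    (2003, ["machine learning", "statistics"]),
]
graphs = annual_cooccurrence(toy_publications)
novel = first_time_edges(graphs)
```

In this toy example the pair (genomics, statistics) is novel in 2001, two new pairs involving machine learning are novel in 2002, and 2003 contributes no first-time pairs because its only pair already appeared in 2002.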

Problem

Research questions and friction points this paper is trying to address.

Predicting first-time interdisciplinary connections between scientific fields

Forecasting novel research field formation using temporal graph data

Evaluating temporal link prediction models on field co-occurrence patterns
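Evaluating forecasts of future links generally requires that training data precede evaluation data in time. The following is a minimal sketch of a chronological split over timestamped edges; the cutoff years and the function name are hypothetical and do not reflect the splits released with FOS.

```python
def chronological_split(edges, val_start, test_start):
    """Split (year, u, v) edges by year so that no future edge leaks into training.

    val_start and test_start are illustrative cutoff years (assumptions).
    """
    train = [e for e in edges if e[0] < val_start]
    val = [e for e in edges if val_start <= e[0] < test_start]
    test = [e for e in edges if e[0] >= test_start]
    return train, val, test


# Toy edges for illustration only.
toy_edges = [
    (1999, "a", "b"),
    (2010, "a", "c"),
    (2015, "b", "c"),
    (2021, "c", "d"),
]
train, val, test = chronological_split(toy_edges, val_start=2010, test_start=2020)
```

Splitting by time rather than at random keeps the task a genuine forecast: the model never sees an edge from a year later than the ones it is asked to predict.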

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses temporal graphs for interdisciplinary link prediction

Enriches nodes with semantic embeddings of long-form field descriptions, which improves prediction accuracy

Evaluates models under multiple negative-sampling regimes
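The summary does not name the negative-sampling regimes used. As an illustration only, the sketch below shows two regimes that are common in temporal graph evaluation: random negatives (node pairs not linked at the current step) and historical negatives (pairs linked in the past but absent at the current step). Whether FOS uses exactly these is an assumption, and all names here are hypothetical.

```python
import random


def random_negatives(nodes, current_edges, k, rng, max_tries=10_000):
    """Sample up to k node pairs that are not positive edges at the current step."""
    negatives = set()
    tries = 0
    while len(negatives) < k and tries < max_tries:
        tries += 1
        u, v = sorted(rng.sample(nodes, 2))  # canonical order, u < v
        if (u, v) not in current_edges:
            negatives.add((u, v))
    return sorted(negatives)


def historical_negatives(past_edges, current_edges, k, rng):
    """Sample up to k pairs seen in earlier steps but absent at the current step."""
    candidates = sorted(past_edges - current_edges)
    return rng.sample(candidates, min(k, len(candidates)))


# Toy data for illustration only.
rng = random.Random(0)
nodes = ["a", "b", "c", "d"]
past = {("a", "b"), ("a", "c"), ("c", "d")}
current = {("a", "b")}

rand_neg = random_negatives(nodes, current, k=3, rng=rng)
hist_neg = historical_negatives(past, current, k=5, rng=rng)
```

Historical negatives tend to be harder than random ones because a model that merely memorizes past edges will score them highly, which is one reason model rankings can differ across regimes.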

🔎 Similar Papers

Forecasting high-impact research topics via machine learning on evolving knowledge graphs

2024-02-13, arXiv.org, Citations: 4
