FOS: A Large-Scale Temporal Graph Benchmark for Scientific Interdisciplinary Link Prediction

📅 2025-11-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the prediction of unexpectedly emerging interdisciplinary research directions, a central challenge in the science of science. Method: The authors introduce FOS (Future Of Science), a large-scale temporal graph benchmark covering 1827–2024 that comprises 65,027 research subfields and their annual co-occurrence relationships, with a focus on predicting "first-time" interdisciplinary links. Co-occurrence edges are timestamped with the publication year, nodes carry semantic embeddings of long-form textual field descriptions, and edges carry temporal and topological descriptors. State-of-the-art temporal graph architectures are evaluated under multiple negative-sampling regimes. Contribution/Results: Experiments show that embedding long-form field descriptions significantly improves prediction accuracy, that distinct model classes excel under different evaluation settings, and that top-ranked predictions align with field pairings that emerge in subsequent years. The dataset, temporal splits, and evaluation code are publicly released as a reproducible benchmark.

📝 Abstract
Interdisciplinary scientific breakthroughs mostly emerge unexpectedly, and forecasting the formation of novel research fields remains a major challenge. We introduce FOS (Future Of Science), a comprehensive time-aware graph-based benchmark that reconstructs annual co-occurrence graphs of 65,027 research sub-fields (spanning 19 general domains) over the period 1827-2024. In these graphs, edges denote the co-occurrence of two fields in a single publication and are timestamped with the corresponding publication year. Nodes are enriched with semantic embeddings, and edges are characterized by temporal and topological descriptors. We formulate the prediction of new field-pair linkages as a temporal link-prediction task, emphasizing the "first-time" connections that signify pioneering interdisciplinary directions. Through extensive experiments, we evaluate a suite of state-of-the-art temporal graph architectures under multiple negative-sampling regimes and show that (i) embedding long-form textual descriptions of fields significantly boosts prediction accuracy, and (ii) distinct model classes excel under different evaluation settings. Case analyses show that top-ranked link predictions on FOS align with field pairings that emerge in subsequent years of academic publications. We publicly release FOS, along with its temporal data splits and evaluation code, to establish a reproducible benchmark for advancing research in predicting scientific frontiers.
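
The graph construction described in the abstract can be illustrated with a minimal sketch. It assumes each publication is available as a `(year, fields)` record; the function names `build_cooccurrence_edges` and `annual_graphs` and the toy field labels are hypothetical and are not taken from the released FOS code.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence_edges(publications):
    """Emit one (field_a, field_b, year) edge per field pair per publication."""
    edges = []
    for year, fields in publications:
        # Sort so each unordered pair always appears in the same orientation.
        for a, b in combinations(sorted(set(fields)), 2):
            edges.append((a, b, year))
    return edges

def annual_graphs(edges):
    """Group timestamped edges into one edge set per publication year."""
    graphs = defaultdict(set)
    for a, b, year in edges:
        graphs[year].add((a, b))
    return dict(graphs)

# Toy input (hypothetical): each record is (publication year, fields it is tagged with).
publications = [
    (1990, ["genetics", "statistics"]),
    (1995, ["genetics", "statistics", "computer science"]),
    (2001, ["computer science", "genetics"]),
]
edges = build_cooccurrence_edges(publications)
graphs = annual_graphs(edges)
```

Keeping the year on every edge, rather than collapsing to a static graph, is what allows a later step to ask whether a pair has ever co-occurred before a given year.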
Problem

Research questions and friction points this paper is trying to address.

Predicting first-time interdisciplinary connections between scientific fields
Forecasting novel research field formation using temporal graph data
Evaluating temporal link prediction models on field co-occurrence graphs
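
The tasks listed above depend on keeping training and evaluation separated in time. The following is a minimal sketch, assuming edges are `(field_a, field_b, year)` tuples and a single cutoff year; the helper names and the cutoff value are hypothetical, and the temporal splits released with FOS may be defined differently.

```python
def chronological_split(edges, cutoff_year):
    """Split timestamped edges into a training period and a later test period."""
    train = [e for e in edges if e[2] <= cutoff_year]
    test = [e for e in edges if e[2] > cutoff_year]
    return train, test

def first_time_pairs(train, test):
    """Return test-period field pairs that never co-occurred in the training period."""
    seen = {frozenset((a, b)) for a, b, _ in train}
    return {frozenset((a, b)) for a, b, _ in test} - seen

# Toy edges (hypothetical field names and years).
edges = [
    ("genetics", "statistics", 1990),
    ("computer science", "genetics", 1995),
    ("genetics", "statistics", 2001),
    ("computer science", "linguistics", 2003),
]
train, test = chronological_split(edges, cutoff_year=2000)
novel = first_time_pairs(train, test)
```

In this toy example the 2001 edge repeats a pair already seen in 1990, so only the 2003 pair counts as a first-time link.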
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses temporal graphs for interdisciplinary link prediction
Enriches nodes with semantic embeddings of long-form field descriptions, which improves prediction accuracy
Evaluates models under multiple negative-sampling regimes
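
The summary does not name the negative-sampling regimes used. As an illustration only, the sketch below assumes two regimes that are common in temporal link-prediction benchmarks: random negatives (any pair that is not a current positive) and historical negatives (pairs seen in the past that are absent now). The function names and toy data are hypothetical.

```python
import random
from itertools import combinations

def random_negatives(nodes, positives, k, rng):
    """Sample up to k node pairs uniformly from pairs that are not current positives."""
    # Enumerating all pairs is fine for a toy graph; a graph with tens of
    # thousands of nodes would need rejection sampling instead.
    candidates = [p for p in combinations(sorted(nodes), 2) if p not in positives]
    return set(rng.sample(candidates, min(k, len(candidates))))

def historical_negatives(past_edges, positives, k, rng):
    """Sample up to k pairs that were linked in the past but are not linked now."""
    candidates = sorted(set(past_edges) - set(positives))
    return set(rng.sample(candidates, min(k, len(candidates))))

# Toy data (hypothetical); pairs are stored as alphabetically sorted tuples.
rng = random.Random(0)
nodes = {"a", "b", "c", "d"}
positives = {("a", "b")}
past_edges = {("a", "b"), ("a", "c"), ("b", "d")}

rand_neg = random_negatives(nodes, positives, k=2, rng=rng)
hist_neg = historical_negatives(past_edges, positives, k=2, rng=rng)
```

Historical negatives are typically harder than random ones because a model cannot reject them simply by memorizing which pairs have co-occurred before, which is one reason results can differ across regimes.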
Kiyan Rezaee
Student, University of Guilan
Information retrieval, Natural Language Processing
Morteza Ziabakhsh
Bachelor’s Graduate, University of Guilan
deep learning, machine learning, software engineering
Niloofar Nikfarjam
Department of Computer Science, University of Guilan
Mohammad M. Ghassemi
Department of Computer Science and Engineering, Michigan State University
Yazdan Rezaee Jouryabi
Institute of Medical Science and Technology, Shahid Beheshti University
Sadegh Eskandari
Department of Computer Science, University of Guilan
Reza Lashgari
Institute of Medical Science and Technology, Shahid Beheshti University