Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives

📅 2026-01-28
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Current large language model (LLM)-based autonomous scientific research systems rely on online, real-time reasoning and therefore suffer from high computational cost, limited context windows, brittle inference, and frequent hallucination. This work proposes a precomputation-driven framework that shifts scientific understanding from online reasoning to offline structured knowledge construction. By extracting methodological units from the literature and organizing them into a scientific knowledge graph, the approach aligns user research intent with established research paradigms. Moving the core computational burden to an offline phase substantially alleviates the context bottleneck of LLMs. In end-to-end experiments, the method generates multiple coherent, methodologically sound, and novel research proposals, improving the reliability, reusability, and efficiency of AI-assisted scientific discovery.

πŸ“ Abstract
Autonomous scientific discovery with large language model (LLM)-based agents has recently made substantial progress, demonstrating the ability to automate end-to-end research workflows. However, existing systems largely rely on runtime-centric execution paradigms, repeatedly reading, summarizing, and reasoning over large volumes of scientific literature online. This on-the-spot computation strategy incurs high computational cost, suffers from context window limitations, and often leads to brittle reasoning and hallucination. We propose Idea2Story, a pre-computation-driven framework for autonomous scientific discovery that shifts literature understanding from online reasoning to offline knowledge construction. Idea2Story continuously collects peer-reviewed papers together with their review feedback, extracts core methodological units, composes reusable research patterns, and organizes them into a structured methodological knowledge graph. At runtime, underspecified user research intents are aligned to established research paradigms, enabling efficient retrieval and reuse of high-quality research patterns instead of open-ended generation and trial-and-error. By grounding research planning and execution in a pre-built knowledge graph, Idea2Story alleviates the context window bottleneck of LLMs and substantially reduces repeated runtime reasoning over literature. We conduct qualitative analyses and preliminary empirical studies demonstrating that Idea2Story can generate coherent, methodologically grounded, and novel research patterns, and can produce several high-quality research demonstrations in an end-to-end setting. These results suggest that offline knowledge construction provides a practical and scalable foundation for reliable autonomous scientific discovery.
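The two-phase design described in the abstract (offline pattern extraction, runtime intent alignment) can be sketched as follows. This is an illustrative toy only: the paper publishes no code, so every class name, field, and the keyword-overlap matching heuristic below are assumptions, not Idea2Story's actual API or retrieval method.

```python
# Hypothetical sketch of precomputation-driven pattern retrieval.
# Offline: index reusable research patterns once. Runtime: align an
# underspecified intent to the closest pattern instead of reasoning
# over raw literature on the spot.
from dataclasses import dataclass, field


@dataclass
class ResearchPattern:
    """A reusable methodological unit, extracted offline from a paper."""
    name: str
    keywords: set[str]
    steps: list[str] = field(default_factory=list)


class PatternGraph:
    """Toy stand-in for the offline methodological knowledge graph."""

    def __init__(self) -> None:
        self.patterns: list[ResearchPattern] = []

    def add(self, pattern: ResearchPattern) -> None:
        # Offline phase: patterns are indexed ahead of any user query.
        self.patterns.append(pattern)

    def align(self, intent: str):
        # Runtime phase: align a free-text intent to the best-overlapping
        # pattern. Keyword overlap is a deliberately crude proxy for the
        # paper's (unspecified) retrieval step.
        words = set(intent.lower().split())
        best = max(self.patterns,
                   key=lambda p: len(p.keywords & words),
                   default=None)
        return best if best and best.keywords & words else None


graph = PatternGraph()
graph.add(ResearchPattern("contrastive pretraining",
                          {"contrastive", "representation"},
                          ["collect corpus", "pretrain encoder", "fine-tune"]))
graph.add(ResearchPattern("ablation study",
                          {"ablation", "component"},
                          ["remove component", "re-run benchmark", "compare"]))

match = graph.align("learn a contrastive representation of molecules")
print(match.name)  # contrastive pretraining
```

The point of the sketch is the cost asymmetry: `add` runs once per paper offline, while `align` does only a cheap lookup at runtime, mirroring the paper's claim that shifting literature understanding offline avoids repeated on-the-spot reasoning.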
Problem

Research questions and friction points this paper is trying to address.

autonomous scientific discovery
large language models
runtime reasoning
context window limitation
scientific literature processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

pre-computation-driven framework
methodological knowledge graph
autonomous scientific discovery
research pattern reuse
LLM context bottleneck
Authors
Tengyue Xu
Zhuoyang Qian
Gaoge Liu
Li Ling (KTH - Royal Institute of Technology; computer vision, deep learning, robotics, autonomous navigation)
Zhentao Zhang
Biao Wu
Shuo Zhang
Ke Lu (Duke University)
Wei Shi
Ziqi Wang
Zheng Feng
Yan Luo
Shu Xu
Yongjin Chen
Zhibo Feng
Zhuo Chen
Bruce Yuan
Harry Wang
Kris Chen