🤖 AI Summary
Current large language model-based autonomous scientific research systems rely on online real-time reasoning, suffering from high computational costs, limited context windows, fragile inference, and frequent hallucinations. This work proposes a precomputation-driven framework that shifts scientific understanding from online reasoning to offline structured knowledge construction. By extracting methodological units and building a scientific knowledge graph, the approach aligns user research intent with established paradigms. It pioneers a paradigm shift in research automation by moving the core computational burden to the offline phase, thereby substantially alleviating the context bottleneck of large language models. In end-to-end experiments, the method generates multiple coherent, methodologically sound, and novel high-quality research proposals, significantly enhancing the reliability, reusability, and efficiency of AI-assisted scientific discovery.
📄 Abstract
Autonomous scientific discovery with large language model (LLM)-based agents has recently made substantial progress, demonstrating the ability to automate end-to-end research workflows. However, existing systems largely rely on runtime-centric execution paradigms, repeatedly reading, summarizing, and reasoning over large volumes of scientific literature online. This on-the-spot computation strategy incurs high computational cost, suffers from context window limitations, and often leads to brittle reasoning and hallucination. We propose Idea2Story, a pre-computation-driven framework for autonomous scientific discovery that shifts literature understanding from online reasoning to offline knowledge construction. Idea2Story continuously collects peer-reviewed papers together with their review feedback, extracts core methodological units, composes reusable research patterns, and organizes them into a structured methodological knowledge graph. At runtime, underspecified user research intents are aligned to established research paradigms, enabling efficient retrieval and reuse of high-quality research patterns instead of open-ended generation and trial-and-error. By grounding research planning and execution in a pre-built knowledge graph, Idea2Story alleviates the context window bottleneck of LLMs and substantially reduces repeated runtime reasoning over literature. We conduct qualitative analyses and preliminary empirical studies demonstrating that Idea2Story can generate coherent, methodologically grounded, and novel research patterns, and can produce several high-quality research demonstrations in an end-to-end setting. These results suggest that offline knowledge construction provides a practical and scalable foundation for reliable autonomous scientific discovery.
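The two-phase design described above — offline extraction of methodological units into a knowledge graph, then runtime alignment of user intent via retrieval rather than open-ended generation — can be sketched minimally as follows. All names here (`MethodUnit`, `KnowledgeGraph`, the keyword-overlap scoring) are illustrative assumptions for exposition, not the actual Idea2Story API or algorithm.

```python
from dataclasses import dataclass

@dataclass
class MethodUnit:
    """A core methodological unit extracted offline from a reviewed paper."""
    name: str
    keywords: set

class KnowledgeGraph:
    """Offline store: method units plus 'composed-with' edges from prior work."""
    def __init__(self):
        self.units = {}   # unit name -> MethodUnit
        self.edges = {}   # unit name -> set of units it has been composed with

    def add_unit(self, unit):
        self.units[unit.name] = unit
        self.edges.setdefault(unit.name, set())

    def link(self, a, b):
        # Record that two units were combined in an established research pattern.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def retrieve(self, intent_keywords):
        # Runtime phase: align an underspecified user intent with stored
        # units by keyword overlap, then expand to the linked pattern.
        scored = sorted(
            self.units.values(),
            key=lambda u: len(u.keywords & intent_keywords),
            reverse=True,
        )
        seed = scored[0]
        return [seed.name] + sorted(self.edges[seed.name])

# Offline construction (in the real system this would run over a paper corpus).
kg = KnowledgeGraph()
kg.add_unit(MethodUnit("contrastive-pretraining", {"contrastive", "representation"}))
kg.add_unit(MethodUnit("graph-encoder", {"graph", "structure"}))
kg.add_unit(MethodUnit("retrieval-augmentation", {"retrieval", "grounding"}))
kg.link("contrastive-pretraining", "graph-encoder")

# Runtime: the heavy work is already done; intent handling reduces to retrieval.
pattern = kg.retrieve({"contrastive", "graph"})
print(pattern)  # ['contrastive-pretraining', 'graph-encoder']
```

The point of the sketch is the cost split: graph construction happens once, offline, while each user query touches only the small retrieved pattern, which is how the framework sidesteps re-reading the literature inside a limited context window at runtime.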