🤖 AI Summary
Current large language model-based autonomous scientific research systems rely on online real-time reasoning, suffering from high computational costs, limited context windows, fragile inference, and frequent hallucinations. This work proposes a precomputation-driven framework that shifts scientific understanding from online reasoning to offline structured knowledge construction. By extracting methodological units and building a scientific knowledge graph, the approach aligns user research intent with established paradigms. It pioneers a paradigm shift in research automation by moving the core computational burden to the offline phase, thereby substantially alleviating the context bottleneck of large language models. In end-to-end experiments, the method generates multiple coherent, methodologically sound, and novel high-quality research proposals, significantly enhancing the reliability, reusability, and efficiency of AI-assisted scientific discovery.
📄 Abstract
Autonomous scientific discovery with large language model (LLM)-based agents has recently made substantial progress, demonstrating the ability to automate end-to-end research workflows. However, existing systems largely rely on runtime-centric execution paradigms, repeatedly reading, summarizing, and reasoning over large volumes of scientific literature online. This on-the-spot computation strategy incurs high computational cost, suffers from context window limitations, and often leads to brittle reasoning and hallucination. We propose Idea2Story, a pre-computation-driven framework for autonomous scientific discovery that shifts literature understanding from online reasoning to offline knowledge construction. Idea2Story continuously collects peer-reviewed papers together with their review feedback, extracts core methodological units, composes reusable research patterns, and organizes them into a structured methodological knowledge graph. At runtime, underspecified user research intents are aligned to established research paradigms, enabling efficient retrieval and reuse of high-quality research patterns instead of open-ended generation and trial-and-error. By grounding research planning and execution in a pre-built knowledge graph, Idea2Story alleviates the context window bottleneck of LLMs and substantially reduces repeated runtime reasoning over literature. We conduct qualitative analyses and preliminary empirical studies demonstrating that Idea2Story can generate coherent, methodologically grounded, and novel research patterns, and can produce several high-quality research demonstrations in an end-to-end setting. These results suggest that offline knowledge construction provides a practical and scalable foundation for reliable autonomous scientific discovery.
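The two-phase design described above — offline extraction of methodological units into a knowledge graph, then runtime alignment of user intent via retrieval rather than open-ended generation — can be sketched minimally as follows. All names here (`MethodUnit`, `KnowledgeGraph`, the keyword-overlap scoring) are illustrative assumptions for exposition, not the actual Idea2Story API or algorithm.

```python
from dataclasses import dataclass

@dataclass
class MethodUnit:
    """A core methodological unit extracted offline from a reviewed paper."""
    name: str
    keywords: set

class KnowledgeGraph:
    """Offline store: method units plus 'composed-with' edges from prior work."""
    def __init__(self):
        self.units = {}   # unit name -> MethodUnit
        self.edges = {}   # unit name -> set of units it has been composed with

    def add_unit(self, unit):
        self.units[unit.name] = unit
        self.edges.setdefault(unit.name, set())

    def link(self, a, b):
        # Record that two units were combined in an established research pattern.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def retrieve(self, intent_keywords):
        # Runtime phase: align an underspecified user intent with stored
        # units by keyword overlap, then expand to the linked pattern.
        scored = sorted(
            self.units.values(),
            key=lambda u: len(u.keywords & intent_keywords),
            reverse=True,
        )
        seed = scored[0]
        return [seed.name] + sorted(self.edges[seed.name])

# Offline construction (in the real system this would run over a paper corpus).
kg = KnowledgeGraph()
kg.add_unit(MethodUnit("contrastive-pretraining", {"contrastive", "representation"}))
kg.add_unit(MethodUnit("graph-encoder", {"graph", "structure"}))
kg.add_unit(MethodUnit("retrieval-augmentation", {"retrieval", "grounding"}))
kg.link("contrastive-pretraining", "graph-encoder")

# Runtime: the heavy work is already done; intent handling reduces to retrieval.
pattern = kg.retrieve({"contrastive", "graph"})
print(pattern)  # ['contrastive-pretraining', 'graph-encoder']
```

The point of the sketch is the cost split: graph construction happens once, offline, while each user query touches only the small retrieved pattern, which is how the framework sidesteps re-reading the literature inside a limited context window at runtime.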