Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents

📅 2025-02-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper proposes Curie, an AI agent framework that targets three weaknesses of LLM-driven scientific experiment automation: low reliability, loose control, and poor interpretability. Curie combines an *intra-agent rigor module*, an *inter-agent rigor module*, and an *experiment knowledge module* so that rigor is enforced throughout execution. The authors also construct a novel benchmark of 46 questions across four computer science domains, derived from research papers and open-source projects. The framework draws on multi-agent coordination, structured experimental planning, causal-reasoning guidance, knowledge-graph-enhanced retrieval, and LLM self-verification. Compared to the strongest baseline tested, Curie achieves a 3.4× improvement in correctly answering experimental questions. The code is open source.

📝 Abstract
Scientific experimentation, a cornerstone of human progress, demands rigor in reliability, methodical control, and interpretability to yield meaningful results. Despite the growing capabilities of large language models (LLMs) in automating different aspects of the scientific process, automating rigorous experimentation remains a significant challenge. To address this gap, we propose Curie, an AI agent framework designed to embed rigor into the experimentation process through three key components: an intra-agent rigor module to enhance reliability, an inter-agent rigor module to maintain methodical control, and an experiment knowledge module to enhance interpretability. To evaluate Curie, we design a novel experimental benchmark composed of 46 questions across four computer science domains, derived from influential research papers and widely adopted open-source projects. Compared to the strongest baseline tested, we achieve a 3.4$\times$ improvement in correctly answering experimental questions. Curie is open-sourced at https://github.com/Just-Curieous/Curie.
Problem

Research questions and friction points this paper is trying to address.

Automating rigorous scientific experimentation
Enhancing reliability and interpretability in experiments
Developing AI agents for methodical control
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI agent framework Curie
intra-agent rigor module
experiment knowledge module
Authors

Patrick Tser Jern Kon
University of Michigan
Metascience, AI for Science, Systems, Security, Networking

Jiachen Liu
Department of Computer Science and Engineering, University of Michigan

Qiuyi Ding
Department of Computer Science and Engineering, University of Michigan

Yiming Qiu
Department of Computer Science and Engineering, University of Michigan

Zhenning Yang
Department of Computer Science and Engineering, University of Michigan

Yibo Huang
University of Michigan
RDMA, Data Center Network, Distributed System, Cloud Computing, Operation System

Jayanth Srinivasa
Cisco Research
Machine Learning, Natural Language Understanding, Federated Learning

Myungjin Lee
Cisco Systems
Networking, Systems

Mosharaf Chowdhury
University of Michigan
Cloud Computing, Machine Learning Systems, Networking, Distributed Systems, Energy Efficiency

Ang Chen
Department of Computer Science and Engineering, University of Michigan