Fantastic Scientific Agents and How to Build Them: AgentBuild for Rietveld Refinement

📅 2026-06-10

🤖 AI Summary

This work addresses two limitations of current LLM-driven scientific agent development: it sidelines the scientist's expert judgment, and it lacks a maintainable, reproducible way to construct agents. The authors propose AgentBuild, a framework that treats agent construction as a distinct workflow stage and builds the agent from a contract the scientist authors: a version-controlled rubric, a difficulty-graded curriculum, and a curated external knowledge base. A rubric-driven judge gates a meta-optimizer coding agent that edits the agent only within a declared boundary. Because the contract is versioned, re-running AgentBuild as base models evolve is a re-tune rather than a full rebuild. The framework is instantiated for Rietveld refinement of X-ray diffraction data through GSAS-II behind MCP and A2A. Starting from a blank harness, a construction run progresses through a lithium lanthanum zirconium oxide (LLZO) signal-to-noise ladder and reaches the 4-hour scan as a frontier case, where the remaining workflow-scope limits become visible. Because the same rubric scores both fit credibility and trajectory scope, that frontier counts as a contract failure rather than a pattern-fitting failure.
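One way to picture the contract is as a small, versioned data structure. The sketch below is illustrative only: the class names, field names, weights, and example entries are all assumptions made for this example, not AgentBuild's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricItem:
    """One scoring criterion. Names and weights here are hypothetical."""
    name: str
    weight: float


@dataclass(frozen=True)
class Contract:
    """Scientist-authored contract: rubric, graded curriculum, knowledge base."""
    version: str                      # bumped whenever the scientist edits the contract
    rubric: tuple[RubricItem, ...]    # criteria the judge scores against
    curriculum: tuple[str, ...]       # cases ordered from easiest to hardest
    knowledge_base: tuple[str, ...]   # curated external reference documents

    def total_weight(self) -> float:
        """Sum of rubric weights, useful for normalizing a weighted score."""
        return sum(item.weight for item in self.rubric)


# Hypothetical example values; the real rubric and curriculum are not given in the source.
contract = Contract(
    version="0.1.0",
    rubric=(
        RubricItem("fit_reliability", 0.6),
        RubricItem("trajectory_scope", 0.4),
    ),
    curriculum=("rung_1", "rung_2", "rung_3"),
    knowledge_base=("refinement_notes.md",),
)
```

Making the dataclasses frozen mirrors the idea that the contract changes only through an explicit, versioned edit by the scientist, never as a side effect of the build.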

📝 Abstract

As scientific workflows shift from deterministic executables to LLM-based agents, the development practices on offer, such as fine-tuning, reinforcement learning, and prompt-and-go, bury the scientist's judgment. We propose treating agent construction as a workflow stage and introduce AgentBuild, which builds a scientific agent from a contract the scientist authors. The contract is a version-controlled rubric, a difficulty-graded curriculum, and a curated external knowledge base. A rubric-driven judge gates a meta-optimizer coding agent that edits the agent within a declared boundary, so the build compiles the agent, not the scientist's judgment. We instantiate this for Rietveld refinement of X-ray diffraction data through GSAS-II behind MCP and A2A, where a blank-harness construction run progresses through a lithium lanthanum zirconium oxide (LLZO) signal-to-noise ladder, reaches the 4-hour scan as a frontier case, and exposes the workflow-scope limits that remain. The same rubric that rewards credible fits also scores trajectory scope, making the frontier a contract failure rather than a pattern-fitting failure. As base models evolve, re-running AgentBuild is a re-tune, not a rebuild, and the scientist's authored contract remains the durable asset.
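The gating idea in the abstract can be sketched as a small acceptance check: a candidate edit from the meta-optimizer is kept only if it stays inside the declared boundary and improves the rubric score. This is a minimal sketch under stated assumptions. The function names, the weighted-mean scoring rule, the path-prefix boundary, and all numbers are hypothetical and are not taken from the paper.

```python
Scores = dict[str, float]  # criterion name -> score in [0, 1]


def rubric_score(scores: Scores, weights: dict[str, float]) -> float:
    """Weighted mean over rubric criteria; a missing criterion counts as 0."""
    total = sum(weights.values())
    return sum(w * scores.get(name, 0.0) for name, w in weights.items()) / total


def within_boundary(edited_paths: list[str], allowed_prefixes: tuple[str, ...]) -> bool:
    """True only if every edited file sits under a declared editable prefix."""
    return all(path.startswith(allowed_prefixes) for path in edited_paths)


def gate(
    candidate: Scores,
    baseline_score: float,
    edited_paths: list[str],
    weights: dict[str, float],
    allowed_prefixes: tuple[str, ...],
) -> bool:
    """Accept an edit only if it respects the boundary and beats the baseline."""
    if not within_boundary(edited_paths, allowed_prefixes):
        return False
    return rubric_score(candidate, weights) > baseline_score


# Hypothetical rubric: fit credibility and trajectory scope weighted equally.
weights = {"fit_reliability": 0.5, "trajectory_scope": 0.5}
allowed = ("agent/",)

# A good fit with poor trajectory scope scores 0.55 and fails a 0.6 baseline.
narrow = gate({"fit_reliability": 0.9, "trajectory_scope": 0.2}, 0.6,
              ["agent/prompt.md"], weights, allowed)

# A balanced candidate scores 0.8 and passes.
balanced = gate({"fit_reliability": 0.8, "trajectory_scope": 0.8}, 0.6,
                ["agent/prompt.md"], weights, allowed)

# Editing outside the declared boundary is rejected regardless of score.
out_of_bounds = gate({"fit_reliability": 1.0, "trajectory_scope": 1.0}, 0.6,
                     ["contract/rubric.yaml"], weights, allowed)
```

In this toy setup the first candidate is rejected even though its fit score is high, which illustrates how a single rubric covering both criteria turns a scope shortfall into a contract failure rather than a fitting failure.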

Problem

Research questions and friction points this paper is trying to address.

scientific agents

LLM-based agents

scientist judgment

agent construction

Rietveld refinement

Innovation

Methods, ideas, or system contributions that make the work stand out.

AgentBuild

scientific agents

contract-driven development

Rietveld refinement

rubric-based optimization
