SynthTools: A Framework for Scaling Synthetic Tools for Agent Development

📅 2025-11-11

🤖 AI Summary

Real-world APIs are limited in availability, domain coverage, and stability, and they often require access keys and impose rate limits, which hinders large-scale training and reproducible evaluation of AI agents. To address this, the paper proposes SynthTools, a scalable framework for generating synthetic tool ecosystems, built from three components: (1) Tool Generation, which automatically creates diverse tools across domains; (2) Tool Simulation, which emulates realistic tool behavior with 94% accuracy; and (3) Tool Audit, which checks the correctness and consistency of the simulation with 99% accuracy. The framework can produce toolsets spanning twice as many domains, with twice as many tools per domain, as prior work, and the downstream tasks built from these tools are difficult even for state-of-the-art models. The result is a practical path toward large-scale training and stable evaluation of tool-use agents.

📝 Abstract

AI agents increasingly rely on external tools to solve complex, long-horizon tasks. Advancing such agents requires reproducible evaluation and large-scale training in controllable, diverse, and realistic tool-use environments. However, real-world APIs are limited in availability, domain coverage, and stability, often requiring access keys and imposing rate limits, which render them impractical for stable evaluation or scalable training. To address these challenges, we introduce SynthTools, a flexible and scalable framework for generating synthetic tool ecosystems. Our framework consists of three core components: Tool Generation for automatic and scalable creation of diverse tools, Tool Simulation to emulate realistic tool behaviors, and Tool Audit to ensure correctness and consistency of tool simulation. To illustrate its scalability, we show that SynthTools can readily produce toolsets that span twice as many domains and twice as many tools per domain as prior work. Furthermore, the tool simulation and tool audit components demonstrate strong reliability, achieving 94% and 99% accuracy respectively. Finally, we construct downstream tasks from the generated tools that even state-of-the-art models struggle to complete. By enabling scalable, diverse, and reliable tool ecosystems, SynthTools provides a practical path toward large-scale training and stable evaluation of tool-use agents. Our code is available at https://github.com/namkoong-lab/SynthTools.

Problem

Research questions and friction points this paper is trying to address.

Addressing limitations of real-world APIs for AI agent training

Creating scalable synthetic tool ecosystems for reliable evaluation

Enabling diverse tool environments for stable agent development

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates synthetic tool ecosystems automatically

Simulates realistic tool behaviors with high accuracy

Ensures correctness and consistency through tool audit
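
To make the three components above concrete, here is a toy Python sketch of a generate, simulate, audit loop. It is not the authors' implementation and does not reflect the code in the SynthTools repository: every name in it (`ToolSpec`, `generate_tools`, `simulate`, `audit`) is hypothetical, and the rule-based logic is an assumed stand-in for the components the abstract describes.

```python
from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    """Hypothetical tool description; field names are illustrative only."""
    name: str
    domain: str
    required: dict = field(default_factory=dict)  # parameter name -> Python type


def generate_tools(domain, names):
    # Stand-in for Tool Generation: every tool here takes one string `query`.
    return [ToolSpec(name=n, domain=domain, required={"query": str}) for n in names]


def simulate(tool, args):
    # Stand-in for Tool Simulation: validate the call, then return a canned response.
    for param, typ in tool.required.items():
        if param not in args:
            return {"status": "error", "message": f"missing parameter: {param}"}
        if not isinstance(args[param], typ):
            return {"status": "error", "message": f"wrong type for: {param}"}
    return {"status": "ok", "tool": tool.name, "result": f"simulated result for {args}"}


def audit(tool, args, response):
    # Stand-in for Tool Audit: the response status must agree with whether
    # the call satisfied the tool's declared parameters.
    valid = all(p in args and isinstance(args[p], t) for p, t in tool.required.items())
    expected = "ok" if valid else "error"
    return response.get("status") == expected


tools = generate_tools("travel", ["search_flights", "book_hotel"])
calls = [{"query": "NYC to SFO"}, {}, {"query": 42}]  # valid, missing, wrong type
results = [audit(t, c, simulate(t, c)) for t in tools for c in calls]
audit_pass_rate = sum(results) / len(results)
```

The design point the sketch is meant to convey is the separation of concerns: the simulator produces responses, and an independent audit step checks each response against the tool's declared interface, so that simulator errors can be counted rather than silently trusted.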
