ToolACE: Winning the Points of LLM Function Calling

📅 2024-09-02
🏛️ International Conference on Learning Representations
📈 Citations: 48
Influential: 9
🤖 AI Summary
To address the scarcity of high-quality, diverse function-calling training data and the limited coverage and low accuracy of existing synthetic approaches, this paper proposes a self-evolving multi-agent data synthesis framework. First, it introduces an API self-evolution mechanism to construct a comprehensive toolset covering 26,507 APIs. Second, it integrates multi-agent collaborative dialogue with formalized thinking guidance to generate instruction-call pairs of high complexity and diversity. Third, it employs a dual-layer verification scheme, combining rule-based checks with large language model (LLM) evaluation, to ensure both semantic correctness and execution accuracy. Remarkably, a model with only 8B parameters trained on this data achieves state-of-the-art performance on the Berkeley Function-Calling Leaderboard, rivaling the accuracy of the latest GPT-4 models. The code, model weights, and a subset of the synthesized data are publicly released on Hugging Face.

📝 Abstract
Function calling significantly extends the application boundary of large language models, where high-quality and diverse training data is critical for unlocking this capability. However, real function-calling data is quite challenging to collect and annotate, while synthetic data generated by existing pipelines tends to lack coverage and accuracy. In this paper, we present ToolACE, an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data. ToolACE leverages a novel self-evolution synthesis process to curate a comprehensive API pool of 26,507 diverse APIs. Dialogs are further generated through the interplay among multiple agents, guided by a formalized thinking process. To ensure data accuracy, we implement a dual-layer verification system combining rule-based and model-based checks. We demonstrate that models trained on our synthesized data, even with only 8B parameters, achieve state-of-the-art performance on the Berkeley Function-Calling Leaderboard, rivaling the latest GPT-4 models. Our model and a subset of the data are publicly available at https://huggingface.co/Team-ACE.
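The dual-layer verification described in the abstract can be sketched roughly as follows. The API schema format, the rule checks, and the keyword-based judge below are all illustrative assumptions, not ToolACE's released implementation; in the paper, the model-based layer is an LLM evaluator rather than a heuristic.

```python
# Hypothetical API schema; ToolACE's actual pool covers 26,507 APIs.
API_POOL = {
    "get_weather": {
        "required": {"city": str},
        "optional": {"unit": str},
    },
}

def rule_layer(call: dict) -> bool:
    """Layer 1 (rule-based): validate a synthesized call against its schema."""
    schema = API_POOL.get(call.get("name"))
    if schema is None:
        return False  # hallucinated API name
    args = call.get("arguments", {})
    for param, ptype in schema["required"].items():
        if param not in args or not isinstance(args[param], ptype):
            return False  # missing or mistyped required argument
    allowed = set(schema["required"]) | set(schema["optional"])
    return set(args) <= allowed  # no invented parameters

def model_layer(instruction: str, call: dict) -> bool:
    """Layer 2 (model-based): the paper uses an LLM to judge whether the
    call semantically matches the instruction; a trivial keyword check
    stands in for that judge here."""
    city = str(call.get("arguments", {}).get("city", ""))
    return bool(city) and city.lower() in instruction.lower()

def verify(instruction: str, call: dict) -> bool:
    # A synthesized sample is kept only if both layers pass.
    return rule_layer(call) and model_layer(instruction, call)

ok = {"name": "get_weather", "arguments": {"city": "Paris", "unit": "celsius"}}
bad = {"name": "get_weather", "arguments": {"unit": "celsius"}}
print(verify("What's the weather in Paris?", ok))   # True
print(verify("What's the weather in Paris?", bad))  # False
```

The rule layer filters structural errors cheaply (unknown APIs, missing or mistyped arguments, hallucinated parameters), so the more expensive model-based layer only sees structurally valid candidates.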
Problem

Research questions and friction points this paper is trying to address.

Generating high-quality, diverse function-calling training data
Overcoming the limited coverage and accuracy of existing synthetic data
Enabling LLMs to achieve state-of-the-art function-calling performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-evolution synthesis for API pool
Multi-agent dialog generation
Dual-layer verification system
👥 Authors
Weiwen Liu
Associate Professor, Shanghai Jiao Tong University
large language models, AI agents, recommender systems
Xu Huang
University of Science and Technology of China
Xingshan Zeng
Huawei Noah’s Ark Lab
Natural Language Processing, Speech Translation, Large Language Models
Xinlong Hao
Huawei Noah’s Ark Lab
Shuai Yu
Huawei Noah’s Ark Lab
Dexun Li
Singapore Management University
Reinforcement Learning, Resource Optimisation, Recommendation System
Shuai Wang
Huawei Noah’s Ark Lab
Weinan Gan
Huawei Noah’s Ark Lab
Large Language Model, Generative IR, Agent
Zhengying Liu
Huawei Noah’s Ark Lab
Yuanqing Yu
Tsinghua University
Zezhong Wang
Institute of Science Tokyo
VLSI physical design
Yuxian Wang
Huawei Technologies Co., Ltd
Wu Ning
Huawei Technologies Co., Ltd
Yutai Hou
Huawei
LLM, NLP, Dialogue, Alignment, Meta Learning
Bin Wang
Huawei Noah’s Ark Lab
Chuhan Wu
WeChat AI, Tencent
Foundation Model, Pretraining, Post Training, LLM Agent
Xinzhi Wang
Huawei Noah’s Ark Lab
Yong Liu
Huawei Noah’s Ark Lab
Yasheng Wang
Tencent
Natural Language Processing
Duyu Tang
Huawei
Natural Language Processing
Dandan Tu
Huawei Technologies Co., Ltd
Lifeng Shang
Huawei Noah’s Ark Lab
Machine Learning, Computer Vision, Pattern Recognition, Natural Language Processing
Xin Jiang
Huawei Noah’s Ark Lab
Ruiming Tang
Huawei Noah’s Ark Lab
Defu Lian
University of Science and Technology of China
Qun Liu
Huawei Noah’s Ark Lab
Enhong Chen
University of Science and Technology of China
data mining, recommender system, machine learning