Scaling Agentic Capabilities via Grounded Interaction Synthesis

📅 2026-06-01

🤖 AI Summary

This work addresses a limitation of existing large language model (LLM)-based methods for synthesizing agent interaction data: the generated data often lacks realism and diversity, which makes it a poor fit for high-fidelity, long-horizon tasks. The paper proposes GAIS (Grounded Agentic Interaction Synthesis), a framework that combines environments derived from real-world Model Context Protocol (MCP) servers with structure-guided task planning. By anchoring environments in real protocol contexts and enforcing logical dependencies and adversarial policies during planning, GAIS produces diverse, complex tasks while reducing the bias of purely LLM-generated data. Evaluations on BFCL, τ²-Bench, and ACEBench show that GAIS-synthesized data significantly outperforms state-of-the-art baselines and lets base models match or even surpass their official instruction-tuned counterparts, with better data efficiency and scalability.

📝 Abstract

General agentic intelligence hinges on the ability to interact with diverse real-world tools to complete complex tasks, a capability fundamentally tied to the quality of interaction data. To bypass the prohibitive costs of human annotation, prevailing paradigms depend entirely on Large Language Models (LLMs) to scale the synthesis of agentic environments and tasks. However, such unconstrained generation often degenerates into biased random sampling of LLMs' internal priors, failing to capture the diversity and difficulty of real-world domains or construct high-fidelity, long-horizon tasks. In this work, we introduce Grounded Agentic Interaction Synthesis (GAIS), a framework that automates the scalable construction of diverse environments and complex tasks via a two-phase grounding mechanism. Specifically, we construct protocol-anchored environments derived from real-world Model Context Protocol (MCP) servers to ensure functional diversity and difficulty. Subsequently, we employ structure-guided planning to navigate these environments, actively enforcing logical dependencies and adversarial policies to generate complex tasks. Experiments on BFCL, $τ^2$-Bench, and ACEBench demonstrate that GAIS-synthesized data significantly outperforms state-of-the-art baselines, enabling base models to match or even surpass their official instruction-tuned counterparts. Furthermore, GAIS exhibits superior data efficiency and scalability, achieving exceptional capabilities with significantly less data while maintaining continuous growth where baselines stagnate. Our code and dataset are publicly available at https://github.com/Eric8932/GAIS.
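
To make the idea of planning under logical dependencies concrete, here is a minimal Python sketch. It is not the GAIS implementation. The tool schemas, the type-matching rule, and the names ToolSpec, plan_task, and is_valid are all assumptions made for illustration. The abstract only says that environments are derived from real MCP servers and that planning enforces logical dependencies. The sketch treats a tool call as admissible only when every input it needs was either supplied by the user or produced by an earlier call.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical, simplified tool schema (not an actual MCP schema)."""
    name: str
    inputs: frozenset   # value types the tool needs
    outputs: frozenset  # value types the tool returns


# Illustrative tools only; a real pipeline would read these from MCP servers.
TOOLS = [
    ToolSpec("search_flights", frozenset({"city", "date"}), frozenset({"flight_id"})),
    ToolSpec("get_fare", frozenset({"flight_id"}), frozenset({"price"})),
    ToolSpec("book_flight", frozenset({"flight_id", "price"}), frozenset({"booking_id"})),
    ToolSpec("send_receipt", frozenset({"booking_id"}), frozenset({"email_status"})),
]


def plan_task(tools, user_provided, max_steps, rng):
    """Sample a tool-call chain in which each call's inputs are already available."""
    available = set(user_provided)
    remaining = list(tools)
    plan = []
    while remaining and len(plan) < max_steps:
        # A tool is callable only once all of its inputs exist.
        ready = [t for t in remaining if t.inputs <= available]
        if not ready:
            break
        choice = rng.choice(ready)
        plan.append(choice.name)
        available |= choice.outputs
        remaining.remove(choice)
    return plan


def is_valid(plan, tools, user_provided):
    """Check that a plan never calls a tool before its inputs are available."""
    by_name = {t.name: t for t in tools}
    available = set(user_provided)
    for name in plan:
        tool = by_name[name]
        if not tool.inputs <= available:
            return False
        available |= tool.outputs
    return True


plan = plan_task(TOOLS, {"city", "date"}, max_steps=4, rng=random.Random(0))
print(plan)
```

With these four toy tools, exactly one tool is callable at each step, so the sampled plan is the single chain from search_flights to send_receipt. A richer tool set would yield many valid orderings, which is where a sampler of this kind would contribute task diversity.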

Problem

Research questions and friction points this paper is trying to address.

agentic intelligence

interaction data synthesis

real-world grounding

task complexity

data diversity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Grounded Agentic Interaction Synthesis

Model Context Protocol

structure-guided planning