🤖 AI Summary
This work addresses the high token consumption and context overhead caused by free-form natural-language communication in multi-agent systems, which degrade performance and raise inference cost. It introduces PACT, a structured, action-state-oriented communication protocol that treats inter-agent communication as a shared state-update problem. PACT compresses each agent’s raw output into a compact action-state record before writing it to the shared history. Across different system topologies and on the production coding harnesses OpenHands and SWE-agent, PACT achieves a better performance-cost trade-off: it raises the resolve rate in OpenHands while using 10% fewer tokens per resolved task, and on SWE-agent it keeps the resolve rate unchanged while halving input tokens.
📝 Abstract
Multi-agent systems (MAS) built on large language models are typically organized around roles, pipelines, and turn schedules, while the content that agents pass to one another is often left as unconstrained natural language. However, this free-form communication can rapidly inflate token usage, consume the shared context window, and ultimately degrade system performance and raise inference cost. We analyze five common inter-agent communication strategies across two MAS topologies and find that no fixed strategy is universally optimal. Instead, effective inter-agent messages consistently preserve the action-centered information that downstream agents need. Building on this, we propose PACT (Protocolized Action-state Communication and Transmission), which treats inter-agent communication as a public state-update problem and projects each raw agent output into a compact action-state record before it enters shared history. Across different MAS topologies, PACT consistently improves the performance-cost trade-off, achieving comparable or stronger task performance with substantially fewer tokens. The gains extend to production coding harnesses: PACT raises the resolve rate of OpenHands while using 10% fewer tokens per resolved task, and leaves the resolve rate of SWE-agent unchanged while halving input tokens. Our code is publicly available at https://github.com/iNLP-Lab/PACT.
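To make the projection step concrete, the following is a minimal sketch of the general idea of writing a compact action-state record to shared history in place of the raw agent output. It is not the PACT implementation: the record fields (`agent`, `action`, `state`), the `ACTION:`/`STATE:` line tags, and the names `ActionStateRecord`, `SharedHistory`, and `project` are all assumptions made for illustration, and a real system would presumably use a far more capable extraction step than this tag-matching stand-in.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ActionStateRecord:
    # Hypothetical schema: the abstract does not specify the record fields.
    agent: str
    action: str
    state: str


@dataclass
class SharedHistory:
    # Only compact records are stored; raw agent outputs never enter here.
    records: List[ActionStateRecord] = field(default_factory=list)

    def append(self, record: ActionStateRecord) -> None:
        self.records.append(record)

    def render(self) -> str:
        # Text that downstream agents would read instead of the raw messages.
        return "\n".join(
            f"[{r.agent}] action={r.action} | state={r.state}"
            for r in self.records
        )


def project(agent: str, raw_output: str) -> ActionStateRecord:
    """Toy projection: keep lines tagged ACTION:/STATE:, drop everything else.

    The tag convention is an assumption used as a stand-in for a real
    extraction step.
    """
    action, state = "", ""
    for line in raw_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("ACTION:"):
            action = stripped[len("ACTION:"):].strip()
        elif stripped.startswith("STATE:"):
            state = stripped[len("STATE:"):].strip()
    return ActionStateRecord(agent=agent, action=action, state=state)
```

In this sketch, each agent turn would be passed through `project` and the result appended to a `SharedHistory`, so the context seen by later agents grows with the size of the records, not with the size of the raw messages.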