SMAC-Talk: A Natural Language Extension of the StarCraft Multi-Agent Challenge for Large Language Models

📅 2026-06-02

🤖 AI Summary
This work addresses the limited capability of large language models (LLMs) to communicate and coordinate via natural language in cooperative multi-agent settings by introducing SMAC-Talk, the first extension of the StarCraft Multi-Agent Challenge that supports natural language interaction, tailored for partially observable, decentralized, and long-horizon decision-making scenarios. We incorporate a natural language communication channel to evaluate coordination and trust among LLM-based agents and propose an adversarial evaluation setup featuring deceptive communicators. Building upon the Qwen3.5 model family, we develop three benchmark agents integrating memory mechanisms and distinct reasoning architectures, systematically analyzing the impact of model scale, memory, and reasoning on collaborative performance. SMAC-Talk is publicly released to establish a new benchmark for multi-agent LLM research.
📝 Abstract
As LLMs become more widely deployed, they are increasingly expected to work alongside other AI agents rather than operating in isolation. Effective coordination in these settings requires agents to communicate, share information and make decisions under uncertainty. We introduce SMAC-Talk, a natural language extension of the StarCraft Multi-Agent Challenge for evaluating LLM-based agents in cooperative multi-agent environments. The environment has several key features such as decentralized control, partial observability and long-horizon decision making. SMAC-Talk includes a natural language communication channel which is used to probe agent coordination and trust. We use this communication channel to construct different evaluation scenarios, including settings with an embedded deceptive communicator that tries to disrupt and deceive allies through communication alone. We provide three agents for benchmarking using 4 models from the Qwen3.5 family and study how reasoning structure, memory and model scale affect coordination between agents. We release SMAC-Talk as an open benchmark to support the research community in developing and evaluating LLM agents in cooperative multi-agent settings.
Problem

Research questions and friction points this paper is trying to address.

multi-agent coordination
natural language communication
large language models
deceptive communication
cooperative environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

natural language communication
multi-agent coordination
deceptive communication
large language models
cooperative benchmarking