Beyond Static Evaluation: Co-Evolutionary Mechanisms for LLM-Driven Strategy Evolution in Adversarial Games

📅 2026-06-08
🤖 AI Summary
This work addresses stagnation in LLM-driven strategy evolution for adversarial multi-agent games, where the evaluation landscape shifts as strategies improve and fixed evaluators become unreliable. The authors propose FAMOU, a framework that brings co-evolutionary mechanisms into LLM-driven, code-level strategy evolution. FAMOU combines three mechanisms: evaluator co-evolution, which adds discovered champions to the opponent pool; hierarchical deep evaluation, which replaces noisy few-game scores with statistically reliable assessments; and weakness pressure, which dynamically up-weights the most difficult opponents. On the MCTF 2026 3v3 maritime capture-the-flag task, the LLM mutation process produces tactical structures absent from the seed strategies, such as lookahead search and adaptive interception. FAMOU achieves the highest combined score (0.526) and a 61.7% win rate against unseen opponents, outperforming the OpenEvolve and ShinkaEvolve baselines, and the evolved strategy placed 1st in the hardware round-robin and 3rd in simulation at the AAMAS 2026 MCTF Competition.
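To make the idea of hierarchical deep evaluation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a two-stage scheme: a cheap few-game screen, followed by a larger batch of games only for candidates that pass. The names `PlayGame`, `estimate_win_rate`, `hierarchical_evaluate`, and the parameters `quick_games`, `deep_games`, and `promote_at` are all hypothetical; the summary does not specify game counts or thresholds.

```python
import math
import random
from typing import Callable, Dict, Tuple

# Hypothetical interface: plays one game and returns True if the candidate wins.
PlayGame = Callable[[random.Random], bool]


def estimate_win_rate(play_game: PlayGame, n_games: int,
                      rng: random.Random) -> Tuple[float, float]:
    """Return (win rate, binomial standard error) over n_games games."""
    wins = sum(1 for _ in range(n_games) if play_game(rng))
    rate = wins / n_games
    stderr = math.sqrt(rate * (1.0 - rate) / n_games)
    return rate, stderr


def hierarchical_evaluate(play_game: PlayGame, rng: random.Random,
                          quick_games: int = 4, deep_games: int = 40,
                          promote_at: float = 0.5) -> Dict[str, object]:
    """Two-stage evaluation (assumed structure, illustrative defaults).

    Stage 1 screens the candidate with a few games. Only candidates whose
    quick win rate reaches promote_at get the expensive deep stage.
    """
    quick_rate, quick_err = estimate_win_rate(play_game, quick_games, rng)
    if quick_rate < promote_at:
        return {"stage": "quick", "games": quick_games,
                "win_rate": quick_rate, "stderr": quick_err}
    deep_rate, deep_err = estimate_win_rate(play_game, deep_games, rng)
    return {"stage": "deep", "games": deep_games,
            "win_rate": deep_rate, "stderr": deep_err}


if __name__ == "__main__":
    # Toy stand-in for a real match: the candidate wins 60% of the time.
    toy_game = lambda r: r.random() < 0.6
    print(hierarchical_evaluate(toy_game, random.Random(0)))
```

The standard error shrinks with the square root of the number of games, which is one way to see why a few-game score is too noisy to rank closely matched strategies.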
📝 Abstract
Recent advances in LLM-driven code evolution have enabled automated discovery by iteratively generating and improving programs. However, applying these methods to adversarial multi-agent games introduces a fundamental challenge: the evaluation landscape shifts as strategies improve, causing fixed evaluators to become unreliable and evolution to stagnate. We propose three mechanisms to address this challenge: evaluator co-evolution, which incorporates discovered champions into the opponent pool; hierarchical deep evaluation, which replaces noisy few-game scores with statistically reliable assessments; and weakness pressure, which dynamically up-weights the most difficult opponents to break through plateaus. We implement these mechanisms within FAMOU, a framework built upon the same foundation-model code-evolution paradigm as OpenEvolve and ShinkaEvolve. On the MCTF 2026 3v3 maritime capture-the-flag task, FAMOU consistently outperforms both baselines under two backbone LLMs, achieving the highest combined score (0.526) and the best generalization to unseen opponents (61.7% win rate), while ablations confirm that each mechanism contributes to performance. Notably, the LLM mutation process generates tactical structures entirely absent from the seed strategies -- including lookahead search and adaptive interception -- demonstrating that code-level evolution can produce nontrivial algorithmic innovations in adversarial settings. The FAMOU-evolved strategy further achieved 1st place in the hardware round-robin and 3rd in simulation at the AAMAS 2026 MCTF Competition, validating its real-world transferability. The optimized implementation and corresponding evaluation code developed through our evolutionary process are available at: https://github.com/1xiangliu1/FAMOU-CoEvo
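As an illustration of how evaluator co-evolution and weakness pressure could interact, here is a small sketch under stated assumptions. It is not taken from the FAMOU repository. The softmax-over-loss-rate weighting, the `temperature` parameter, and the function names `add_champion`, `weakness_weights`, and `weighted_score` are hypothetical choices made for this example; the abstract states only that champions join the opponent pool and that the most difficult opponents are up-weighted.

```python
import math
from typing import Dict, List


def add_champion(pool: List[str], champion: str) -> List[str]:
    """Evaluator co-evolution (sketch): a newly discovered champion
    becomes an opponent for later candidates. Returns a new list."""
    return pool if champion in pool else pool + [champion]


def weakness_weights(win_rates: Dict[str, float],
                     temperature: float = 0.25) -> Dict[str, float]:
    """Weakness pressure (assumed form): softmax over loss rates, so
    opponents the candidate beats least often receive the most weight."""
    scores = {name: math.exp((1.0 - rate) / temperature)
              for name, rate in win_rates.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


def weighted_score(win_rates: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Fitness as the weight-averaged win rate over the opponent pool."""
    return sum(weights[name] * win_rates[name] for name in win_rates)


if __name__ == "__main__":
    pool = add_champion(["seed_a", "seed_b"], "champion_1")
    # Illustrative win rates of one candidate against each pool member.
    rates = {"seed_a": 0.9, "seed_b": 0.5, "champion_1": 0.2}
    weights = weakness_weights(rates)
    print(pool)
    print(weights)
    print(weighted_score(rates, weights))
```

With this weighting, a candidate that only beats easy opponents scores below its plain average win rate, so selection favors mutations that improve against the hardest pool members. A lower `temperature` concentrates more weight on the single hardest opponent.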
Problem

Research questions and friction points this paper is trying to address.

adversarial games
LLM-driven strategy evolution
dynamic evaluation landscape
evolution stagnation
multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

co-evolution
LLM-driven code evolution
adversarial games
hierarchical evaluation
weakness pressure
Haoran Li
Institute of Automation, Chinese Academy of Sciences
Artificial Intelligence, Robotics, Reinforcement Learning, Embodied Intelligence
Zengle Ge
Famou Agent Team, Baidu AI Cloud
Ziyang Zhang
Famou Agent Team, Baidu AI Cloud
Xiaomin Yuan
Famou Agent Team, Baidu AI Cloud
Yui Lo
The University of Sydney, Australia
Qianhui Liu
School of Artificial Intelligence, University of Chinese Academy of Sciences
Bocheng An
School of Transportation, Southeast University
Dongke Rong
Famou Agent Team, Baidu AI Cloud
Jiaqun Liu
School of Software and Microelectronics, Peking University
Annan Li
Famou Agent Team, Baidu AI Cloud
Jianmin Wu
Famou Agent Team, Baidu AI Cloud
Dawei Yin
Senior Director, Head of Search Science at Baidu
Machine Learning, Web Mining, Data Mining
Dou Shen
Baidu Inc
Data Mining, Machine Learning, Online Advertising