Multi-Agent Craftax: Benchmarking Open-Ended Multi-Agent Reinforcement Learning at the Hyperscale

📅 2025-11-07
🤖 AI Summary
Problem: Existing MARL benchmarks predominantly focus on short-horizon, homogeneous tasks, which limits rigorous evaluation of long-horizon dependency modeling, complex coordination, and generalization across diverse agent roles and tasks. Method: We introduce Craftax-MA, a high-performance, scalable, open-source multi-agent extension of Craftax implemented in JAX, and Craftax-Coop, its collaborative extension, which adds heterogeneous agents, explicit trading mechanisms, and further mechanics that require cooperation. The benchmark unifies long-horizon evaluation (>1000 steps), structured cooperation, and cross-task generalization within a single testbed, backed by efficient parallel simulation: a training run of 250 million environment interactions completes in under an hour. Contribution/Results: Empirical evaluation reveals limitations of mainstream MARL algorithms in long-horizon credit assignment, exploration, and cooperation. Craftax-MA and Craftax-Coop are thus positioned as a rigorous benchmark for assessing foundational MARL capabilities, particularly in long-term, cooperative, and generalizable settings.

📝 Abstract
Progress in multi-agent reinforcement learning (MARL) requires challenging benchmarks that assess the limits of current methods. However, existing benchmarks often target narrow short-horizon challenges that do not adequately stress the long-term dependencies and generalization capabilities inherent in many multi-agent systems. To address this, we first present Craftax-MA: an extension of the popular open-ended RL environment, Craftax, that supports multiple agents and evaluates a wide range of general abilities within a single environment. Written in JAX, Craftax-MA is exceptionally fast with a training run using 250 million environment interactions completing in under an hour. To provide a more compelling challenge for MARL, we also present Craftax-Coop, an extension introducing heterogeneous agents, trading and more mechanics that require complex cooperation among agents for success. We provide analysis demonstrating that existing algorithms struggle with key challenges in this benchmark, including long-horizon credit assignment, exploration and cooperation, and argue for its potential to drive long-term research in MARL.
Problem

Research questions and friction points this paper is trying to address.

Existing MARL benchmarks target narrow short-horizon challenges and rarely stress long-term dependencies or generalization
Existing MARL algorithms struggle with long-horizon credit assignment, exploration, and cooperation
Complex multi-agent cooperation lacks a demanding testbed, which Craftax-MA and Craftax-Coop aim to provide
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends Craftax to multi-agent reinforcement learning
Uses JAX for exceptionally fast training speeds
Introduces heterogeneous agents and trading mechanics
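
To illustrate why a JAX implementation can reach this kind of throughput, the sketch below batches a toy multi-agent environment with `jax.vmap` and rolls it out inside a single compiled `jax.lax.scan` loop. The environment, its dynamics, and every name in it (`env_step`, `rollout`, `NUM_AGENTS`, `NUM_ENVS`, `NUM_STEPS`) are hypothetical stand-ins chosen for illustration. This is not the Craftax-MA API and not the authors' training code.

```python
import jax
import jax.numpy as jnp

# Hypothetical toy environment, NOT the Craftax-MA API: each agent walks on a
# line, and all agents share one reward, the negative mean distance from 0.
NUM_AGENTS = 4     # agents per environment (assumed value)
NUM_ENVS = 1024    # environments simulated in parallel (assumed value)
NUM_STEPS = 128    # rollout length (assumed value)


def env_step(positions, actions):
    """Advance ONE environment by one step.

    positions: int32 array of shape (NUM_AGENTS,)
    actions:   int32 array of shape (NUM_AGENTS,), each in {0, 1, 2}
    """
    new_positions = positions + (actions - 1)  # map {0, 1, 2} to {-1, 0, +1}
    reward = -jnp.mean(jnp.abs(new_positions).astype(jnp.float32))
    return new_positions, reward


# vmap lifts the single-environment step to a batch of environments.
batched_step = jax.vmap(env_step)


@jax.jit
def rollout(key):
    """Roll out NUM_ENVS environments for NUM_STEPS steps with random actions."""
    init_positions = jnp.zeros((NUM_ENVS, NUM_AGENTS), dtype=jnp.int32)

    def scan_body(positions, step_key):
        # Random actions stand in for a learned policy.
        actions = jax.random.randint(step_key, (NUM_ENVS, NUM_AGENTS), 0, 3)
        positions, rewards = batched_step(positions, actions)
        return positions, rewards

    step_keys = jax.random.split(key, NUM_STEPS)
    # lax.scan keeps the whole time loop inside one compiled program.
    final_positions, rewards = jax.lax.scan(scan_body, init_positions, step_keys)
    return final_positions, rewards  # rewards has shape (NUM_STEPS, NUM_ENVS)


final_positions, rewards = rollout(jax.random.PRNGKey(0))
total_env_steps = NUM_ENVS * NUM_STEPS  # environment steps in this one call
```

Because the step function, the batching, and the time loop compile into one program, no Python overhead is paid per environment step. For scale, completing 250 million interactions in under an hour corresponds to an average of more than roughly 69,000 interactions per second over the whole training run (250,000,000 / 3,600 s).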