🤖 AI Summary
To address cooperative mission planning for heterogeneous mobile robot teams under communication constraints and limited onboard computation, this paper proposes the CATMiP framework, which combines the CMacDec-POMDP model with an asynchronous centralized-training, distributed-execution scheme based on the Multi-Agent Transformer (MAT). The method integrates class-based macro-action modeling with decentralized partially observable Markov decision processes (Dec-POMDPs), so that a single trained model can be deployed under sporadic ad hoc networking, larger environments, and varying team sizes and compositions. In 2D grid-world simulations, it outperforms planning-based exploration methods in efficiency and robustness: performance degradation remains below 12% under communication failures, and linear convergence is preserved even with ≥50 robots. The policy also transfers across environments and adapts to varying team scales. The core contributions are the class-based macro-action Dec-POMDP formulation (CMacDec-POMDP) and an asynchronous MAT training mechanism, which together provide expressive modeling power, distributed feasibility, and practical deployability.
📝 Abstract
Cooperative mission planning for heterogeneous teams of mobile robots presents a unique set of challenges, particularly when operating under communication constraints and limited computational resources. To address these challenges, we propose the Cooperative and Asynchronous Transformer-based Mission Planning (CATMiP) framework, which leverages multi-agent reinforcement learning (MARL) to coordinate distributed decision making among agents with diverse sensing, motion, and actuation capabilities, operating under sporadic ad hoc communication. A Class-based Macro-Action Decentralized Partially Observable Markov Decision Process (CMacDec-POMDP) is also formulated to effectively model asynchronous decision-making for heterogeneous teams of agents. The framework utilizes an asynchronous centralized training and distributed execution scheme that is developed based on the Multi-Agent Transformer (MAT) architecture. This design allows a single trained model to generalize to larger environments and accommodate varying team sizes and compositions. We evaluate CATMiP in a 2D grid-world simulation environment and compare its performance against planning-based exploration methods. Results demonstrate CATMiP's superior efficiency, scalability, and robustness to communication dropouts, highlighting its potential for real-world heterogeneous mobile robot systems. The code is available at https://github.com/mylad13/CATMiP.
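To make the idea of asynchronous, class-based macro-actions concrete, the following is a minimal illustrative sketch and not the CATMiP implementation. The agent classes, macro-action names, durations, and the `choose` placeholder policy are all hypothetical; in the paper the decision would come from a MAT-based policy. The sketch only shows that each agent picks a new macro-action from its own class's set when its previous one terminates, so decision times differ across agents.

```python
from dataclasses import dataclass

# Hypothetical agent classes, each with its own macro-action set.
# Values are assumed durations in primitive time steps.
MACRO_ACTIONS = {
    "explorer": {"go_to_frontier": 2, "scan_region": 3},
    "rescuer": {"go_to_target": 4, "wait": 1},
}


@dataclass
class Agent:
    name: str
    cls: str
    current: str = ""   # macro-action currently being executed
    remaining: int = 0  # primitive steps left before it terminates


def choose(agent: Agent, t: int) -> str:
    """Placeholder policy (assumption): cycle through the class's options."""
    options = sorted(MACRO_ACTIONS[agent.cls])
    return options[t % len(options)]


def run(agents: list, horizon: int) -> list:
    """Return decision events as (time step, agent name, macro-action)."""
    log = []
    for t in range(horizon):
        for agent in agents:
            # Only agents whose macro-action has terminated decide again.
            if agent.remaining == 0:
                agent.current = choose(agent, t)
                agent.remaining = MACRO_ACTIONS[agent.cls][agent.current]
                log.append((t, agent.name, agent.current))
            agent.remaining -= 1
    return log


if __name__ == "__main__":
    team = [Agent("e1", "explorer"), Agent("r1", "rescuer")]
    for event in run(team, horizon=6):
        print(event)
```

In this toy run the two agents make decisions at different time steps because their macro-actions have different durations, which is the kind of asynchrony a macro-action Dec-POMDP is meant to model.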