🤖 AI Summary
This work combines decentralized diffusion-based trajectory generation with multi-agent reinforcement learning (MARL) to address the limitations of both centralized and fully decentralized multi-robot coordination. Centralized planning scales poorly as the number of robots grows, while fully decentralized planning does not inherently account for inter-agent interactions. In the proposed method, each robot generates trajectories independently with a diffusion model trained on single-agent data, and a centrally trained MARL value function steers the reverse denoising process through its gradient. This yields interaction-aware planning without centralized joint planning or retraining of the generative model. The guidance follows an exponential tilting formulation that biases the denoising distribution toward trajectories with higher expected multi-agent return. In a four-robot maze navigation simulation, the method reduces the inter-agent interference rate from 55.4% to 41.8% (13.6 percentage points, roughly a quarter in relative terms) while preserving the scalability of decentralized trajectory generation.
📝 Abstract
Coordinating multiple robots in shared environments requires generating feasible trajectories for each agent while accounting for interactions among agents. Centralized planning approaches become difficult to scale as the number of robots increases, while decentralized approaches that allow each agent to plan independently do not inherently account for inter-agent interactions. This paper presents a framework for coordinated multi-robot motion planning that combines decentralized generative trajectory planning with multi-agent reinforcement learning (MARL)-based coordination. Each robot independently generates candidate trajectories using a diffusion model trained on single-agent motion data, leveraging the generative model's ability to produce feasible and diverse trajectories. To reduce conflicts between agents, a centralized value function trained via MARL guides the reverse diffusion process through gradient-based steering, enabling interaction-aware trajectory generation without centralized joint planning or retraining of the generative model. This guidance follows an exponential tilting formulation, in which the value function biases the denoising distribution toward trajectories with higher expected multi-agent return. The framework is evaluated in a simulated maze environment with four mobile robots. Experimental results show that the proposed value-guided diffusion planning reduces the inter-agent interference rate from 55.4% to 41.8% while preserving the scalability of decentralized trajectory generation. These results suggest that MARL-based value guidance can introduce coordination into decentralized generative planners without requiring a fully joint multi-robot model.
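The value-guided reverse step described above can be illustrated with a small NumPy sketch. It is not the paper's implementation, and every name and constant in it is hypothetical. The trained single-agent diffusion model is replaced by the exact score of a toy Gaussian prior around a straight start-to-goal line, and the MARL value function by a hand-written proximity penalty. The sketch relies on a standard first-order argument: if the unguided reverse kernel is N(mu, Sigma), tilting it by exp(alpha * V) and linearizing V gives approximately N(mu + alpha * Sigma * grad V, Sigma).

```python
import numpy as np

S0 = 0.3  # assumed spread of single-agent demonstrations around a straight line


def prior_score(x, line, sigma):
    """Exact score of the toy prior N(line, S0^2 I) after adding noise of std
    sigma. Hypothetical stand-in for a trained single-agent diffusion model."""
    return (line - x) / (S0**2 + sigma**2)


def value(x, others, radius=0.5):
    """Toy stand-in for the centralized MARL value function: penalizes being
    close to other agents' waypoints at the same time index."""
    d2 = np.sum((x[None] - others) ** 2, axis=-1)  # (n_others, horizon)
    return -np.sum(np.exp(-d2 / (2 * radius**2)))


def value_grad(x, others, radius=0.5):
    """Analytic gradient of `value` with respect to the trajectory x."""
    diff = x[None] - others  # (n_others, horizon, 2)
    w = np.exp(-np.sum(diff**2, axis=-1, keepdims=True) / (2 * radius**2))
    return np.sum(w * diff, axis=0) / radius**2


def guided_reverse_step(x, line, others, sigma, sigma_next, alpha, rng):
    """One reverse step from noise level sigma to sigma_next. The unguided
    kernel is N(mu, step * I); the tilt shifts its mean by alpha * step * grad V."""
    step = sigma**2 - sigma_next**2
    mu = x + step * prior_score(x, line, sigma)
    mu = mu + alpha * step * value_grad(x, others)  # value guidance
    return mu + np.sqrt(step) * rng.normal(size=x.shape)


def sample(start, goal, others, alpha, horizon=16, steps=100, seed=0):
    """Sample one robot's trajectory; alpha = 0 disables the guidance."""
    rng = np.random.default_rng(seed)
    line = np.linspace(start, goal, horizon)  # (horizon, 2)
    sigmas = np.geomspace(1.0, 0.01, steps)   # decreasing noise levels
    # Start from the noised marginal of the toy prior at the largest sigma.
    x = line + np.sqrt(S0**2 + sigmas[0] ** 2) * rng.normal(size=line.shape)
    for s, s_next in zip(sigmas[:-1], sigmas[1:]):
        x = guided_reverse_step(x, line, others, s, s_next, alpha, rng)
    return x
```

In this sketch each robot would call `sample` on its own, passing the other robots' current plans as `others` with shape `(n_others, horizon, 2)`. The generative prior is never retrained; only the additive gradient term couples the agents, which is the property the abstract attributes to value guidance.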