Communication-Aware Multi-Agent Reinforcement Learning for Decentralized Cooperative UAV Deployment

📅 2026-03-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of cooperative deployment for drone swarms under partial observability and intermittent communication by proposing a graph neural network–based multi-agent reinforcement learning approach grounded in the centralized training with decentralized execution (CTDE) paradigm. The method employs a distance-constrained communication graph and introduces agent-entity attention alongside neighbor self-attention mechanisms, enabling efficient coordination using only local observations and messages from nearby agents. This architecture supports zero-shot generalization to formations of varying scales. Experimental results demonstrate that, in the DroneConnect task, a team of five drones achieves 74% area coverage—approaching the offline upper bound provided by mixed-integer linear programming—and significantly outperforms non-communicating baselines in the DroneCombat task, thereby validating the approach’s effectiveness and scalability.

📝 Abstract
Autonomous Unmanned Aerial Vehicle (UAV) swarms are increasingly used as rapidly deployable aerial relays and sensing platforms, yet practical deployments must operate under partial observability and intermittent peer-to-peer links. We present a graph-based multi-agent reinforcement learning framework trained under centralized training with decentralized execution (CTDE): a centralized critic and global state are available only during training, while each UAV executes a shared policy using local observations and messages from nearby neighbors. Our architecture encodes local agent state and nearby entities with an agent-entity attention module, and aggregates inter-UAV messages with neighbor self-attention over a distance-limited communication graph. We evaluate primarily on a cooperative relay deployment task (DroneConnect) and secondarily on an adversarial engagement task (DroneCombat). In DroneConnect, the proposed method achieves high coverage under restricted communication and partial observation (e.g., 74% coverage with M = 5 UAVs and N = 10 nodes) while remaining competitive with an offline upper bound computed by mixed-integer linear programming (MILP), and it generalizes to unseen team sizes without fine-tuning. In the adversarial setting, the same framework transfers without architectural changes and improves win rate over non-communicating baselines.
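The two building blocks named in the abstract, a distance-limited communication graph and attention-based aggregation of neighbor messages, can be sketched roughly as follows. This is a minimal plain-Python illustration, not the paper's implementation: the function names, the dot-product attention form, and the use of an agent's own embedding as the query are all assumptions.

```python
import math

def comm_graph(positions, comm_range):
    """Distance-limited communication graph: agent j is a neighbor of
    agent i iff their Euclidean distance is within comm_range (j != i)."""
    n = len(positions)
    return [
        [j for j in range(n)
         if j != i and math.dist(positions[i], positions[j]) <= comm_range]
        for i in range(n)
    ]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def neighbor_attention(embeddings, neighbors, i):
    """Aggregate messages from agent i's neighbors with softmax attention,
    using agent i's own embedding as the query (a simplification of the
    neighbor self-attention module; the paper's exact parameterization
    may differ). Returns a zero vector if no neighbor is in range."""
    nbrs = neighbors[i]
    dim = len(embeddings[i])
    if not nbrs:
        return [0.0] * dim  # intermittent links: no messages received
    scores = [_dot(embeddings[i], embeddings[j]) for j in nbrs]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return [sum(w * embeddings[j][d] for w, j in zip(weights, nbrs))
            for d in range(dim)]
```

Because each agent only reads the rows of the graph and the embeddings of its in-range neighbors, execution stays fully decentralized; the centralized critic of CTDE would consume the global state only at training time.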
Problem

Research questions and friction points this paper is trying to address.

multi-agent reinforcement learning
decentralized cooperation
UAV deployment
partial observability
intermittent communication
Innovation

Methods, ideas, or system contributions that make the work stand out.

communication-aware MARL
graph-based attention
decentralized UAV coordination
CTDE
partial observability
Enguang Fan
Department of Computer Science, University of Illinois at Urbana-Champaign
Mobile Computing, Wireless
Yifan Chen
University of Illinois at Urbana-Champaign
Zihan Shan
University of Illinois at Urbana-Champaign
Matthew Caesar
Professor of Computer Science, University of Illinois
Systems and networking
Jae Kim
Boeing Research and Technology