Communication-Aware Multi-Agent Reinforcement Learning for Decentralized Cooperative UAV Deployment

📅 2026-03-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of cooperative deployment for drone swarms under partial observability and intermittent communication by proposing a graph neural network–based multi-agent reinforcement learning approach grounded in the centralized training with decentralized execution (CTDE) paradigm. The method employs a distance-constrained communication graph and introduces agent-entity attention alongside neighbor self-attention mechanisms, enabling efficient coordination using only local observations and messages from nearby agents. This architecture supports zero-shot generalization to formations of varying scales. Experimental results demonstrate that, in the DroneConnect task, a team of five drones achieves 74% area coverage—approaching the offline upper bound provided by mixed-integer linear programming—and significantly outperforms non-communicating baselines in the DroneCombat task, thereby validating the approach’s effectiveness and scalability.

📝 Abstract
Autonomous Unmanned Aerial Vehicle (UAV) swarms are increasingly used as rapidly deployable aerial relays and sensing platforms, yet practical deployments must operate under partial observability and intermittent peer-to-peer links. We present a graph-based multi-agent reinforcement learning framework trained under centralized training with decentralized execution (CTDE): a centralized critic and global state are available only during training, while each UAV executes a shared policy using local observations and messages from nearby neighbors. Our architecture encodes local agent state and nearby entities with an agent-entity attention module, and aggregates inter-UAV messages with neighbor self-attention over a distance-limited communication graph. We evaluate primarily on a cooperative relay deployment task (DroneConnect) and secondarily on an adversarial engagement task (DroneCombat). In DroneConnect, the proposed method achieves high coverage under restricted communication and partial observation (e.g., 74% coverage with M = 5 UAVs and N = 10 nodes) while remaining competitive with an offline upper bound computed by mixed-integer linear programming (MILP), and it generalizes to unseen team sizes without fine-tuning. In the adversarial setting, the same framework transfers without architectural changes and improves win rate over non-communicating baselines.
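The two building blocks named in the abstract, a distance-limited communication graph and attention-based aggregation of neighbor messages, can be sketched roughly as follows. This is a minimal plain-Python illustration, not the paper's implementation: the function names, the dot-product attention form, and the use of an agent's own embedding as the query are all assumptions.

```python
import math

def comm_graph(positions, comm_range):
    """Distance-limited communication graph: agent j is a neighbor of
    agent i iff their Euclidean distance is within comm_range (j != i)."""
    n = len(positions)
    return [
        [j for j in range(n)
         if j != i and math.dist(positions[i], positions[j]) <= comm_range]
        for i in range(n)
    ]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def neighbor_attention(embeddings, neighbors, i):
    """Aggregate messages from agent i's neighbors with softmax attention,
    using agent i's own embedding as the query (a simplification of the
    neighbor self-attention module; the paper's exact parameterization
    may differ). Returns a zero vector if no neighbor is in range."""
    nbrs = neighbors[i]
    dim = len(embeddings[i])
    if not nbrs:
        return [0.0] * dim  # intermittent links: no messages received
    scores = [_dot(embeddings[i], embeddings[j]) for j in nbrs]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return [sum(w * embeddings[j][d] for w, j in zip(weights, nbrs))
            for d in range(dim)]
```

Because each agent only reads the rows of the graph and the embeddings of its in-range neighbors, execution stays fully decentralized; the centralized critic of CTDE would consume the global state only at training time.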
Problem

Research questions and friction points this paper is trying to address.

multi-agent reinforcement learning
decentralized cooperation
UAV deployment
partial observability
intermittent communication
Innovation

Methods, ideas, or system contributions that make the work stand out.

communication-aware MARL
graph-based attention
decentralized UAV coordination
CTDE
partial observability
Enguang Fan
Department of Computer Science, University of Illinois at Urbana-Champaign
Mobile Computing, Wireless
Yifan Chen
University of Illinois at Urbana-Champaign
Zihan Shan
University of Illinois at Urbana-Champaign
Matthew Caesar
Professor of Computer Science, University of Illinois
Systems and networking
Jae Kim
Boeing Research and Technology