Transformer-based Multi-agent Reinforcement Learning for Separation Assurance in Structured and Unstructured Airspaces

📅 2026-01-07
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing optimization-based air traffic scheduling methods in handling stochasticity in Advanced Air Mobility (AAM) and the poor generalization of multi-agent reinforcement learning (MARL) approaches across diverse airspace structures. To overcome these challenges, the authors propose a MARL framework leveraging a relative polar coordinate state representation, integrated with a lightweight Transformer encoder and a decentralized conflict resolution strategy. The framework trains a universal policy capable of generating safe velocity advisories under varied traffic patterns and intersection angles. Experimental results demonstrate that the approach significantly enhances adaptability and scalability in both structured and unstructured airspace environments. Notably, a single-layer Transformer configuration achieves near-zero near-mid-air collision rates and outperforms deeper networks and pure attention-based baselines in terms of time spent in separation violations.

📝 Abstract
Conventional optimization-based metering depends on strict adherence to precomputed schedules, which limits the flexibility required for the stochastic operations of Advanced Air Mobility (AAM). In contrast, multi-agent reinforcement learning (MARL) offers a decentralized, adaptive framework that can better handle uncertainty, as required for safe aircraft separation assurance. Despite this advantage, current MARL approaches often overfit to specific airspace structures, limiting their adaptability to new configurations. To improve generalization, we recast the MARL problem in a relative polar state space and train a transformer encoder model across diverse traffic patterns and intersection angles. The learned model provides speed advisories to resolve conflicts while keeping aircraft near their desired cruising speeds. In our experiments, we evaluated encoder depths of 1, 2, and 3 layers in both structured and unstructured airspaces, and found that the single-layer encoder configuration outperformed the deeper variants, yielding near-zero near-mid-air-collision (NMAC) rates and shorter loss-of-separation infringements. Additionally, we showed that the same configuration outperforms a purely attention-based baseline model. Together, our results suggest that the newly formulated state representation, the neural network architecture design, and the proposed training strategy provide an adaptable and scalable decentralized solution for aircraft separation assurance in both structured and unstructured airspaces.
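The relative polar state space mentioned in the abstract can be illustrated with a minimal sketch. The listing does not give the paper's actual state features, so everything here is an assumption for illustration: the function name `relative_polar_state`, the choice of range and relative bearing as the only features, headings measured counterclockwise from the x-axis in radians, and the nearest-first ordering. The point it shows is why such a representation may help generalization: the features depend only on geometry relative to the ownship, not on where the route sits on the map or how it is oriented.

```python
import math


def relative_polar_state(own_xy, own_heading, intruders_xy):
    """Hypothetical helper: describe each intruder by (range, relative bearing).

    own_heading is in radians, measured counterclockwise from the x-axis
    (an assumed convention, not one taken from the paper).
    """
    ox, oy = own_xy
    state = []
    for ix, iy in intruders_xy:
        dx, dy = ix - ox, iy - oy
        rng = math.hypot(dx, dy)
        # Bearing relative to the ownship heading, wrapped to [-pi, pi).
        bearing = math.atan2(dy, dx) - own_heading
        bearing = (bearing + math.pi) % (2 * math.pi) - math.pi
        state.append((rng, bearing))
    # Nearest intruder first (an assumed ordering for illustration).
    return sorted(state)


# Same encounter geometry, before and after rotating the scene by 90 degrees
# and shifting it by (10, 10): the relative polar features match up to
# floating-point error.
original = relative_polar_state((0.0, 0.0), 0.0, [(3.0, 4.0), (1.0, 0.0)])
moved = relative_polar_state((10.0, 10.0), math.pi / 2, [(6.0, 13.0), (10.0, 11.0)])
```

Under these assumptions, a policy trained on such features sees identical inputs for geometrically identical encounters, regardless of the intersection angle or absolute position of the routes.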
Problem

Research questions and friction points this paper is trying to address.

multi-agent reinforcement learning
separation assurance
airspace generalization
Advanced Air Mobility
conflict resolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based MARL
relative polar state space
decentralized separation assurance
generalization in airspaces
speed advisories
Arsyi Aziz
Department of Computer Science, George Washington University, Washington D.C., United States
Peng Wei
George Washington University
Aviation, Control, Optimization, Machine Learning, Artificial Intelligence