🤖 AI Summary
Traditional reinforcement learning faces significant limitations in communication networks: low sample efficiency, difficulty modeling long-term dependencies, and partial observability. This work systematically reviews Transformer-based reinforcement learning approaches, which use self-attention to capture long-range dependencies and global correlations, and which can thereby improve the handling of heterogeneous data, training efficiency, and adaptability in partially observable environments. It surveys the theoretical foundations of this paradigm and its applications in key networking tasks such as resource allocation, computation offloading, routing, trajectory control, and network security. The study also discusses challenges and open problems, and outlines future research directions, including Transformer-enhanced deep reinforcement learning for semantic communication and network optimization.
📝 Abstract
Reinforcement Learning (RL) has long been a powerful solution to various problems in communication networks. However, traditional RL models still face several limitations. Not only do they rely on large numbers of interactions with the environment, but they are also limited in modeling long-term relationships and handling partial observability. In recent years, the Transformer model has demonstrated the ability to enhance RL models, allowing them to overcome these issues. In particular, the self-attention mechanism within the Transformer enables efficient modeling of long-range dependencies and global correlations, accelerates training, and handles heterogeneous data modalities. In this paper, we present a comprehensive survey of Transformer-based RL algorithms and their applications in communication networks. Specifically, the paper provides the mathematical background of RL and Transformer architectures, along with insights into key issues such as resource allocation, computation offloading, routing, trajectory control, and network security. We conclude the paper by discussing challenges, open issues, and notable future research directions, including Transformer-enhanced deep RL (DRL) algorithms for semantic communication and network optimization.