๐ค AI Summary
High renewable energy penetration introduces high-dimensional combinatorial action spaces in power system topology control, rendering conventional single-agent and fully decentralized approaches inadequate for balancing efficiency and global consistency. To address this, we propose a Center-Coordinated Multi-Agent (CCMA) architecture: regional agents generate local action proposals, while a central coordinator performs global filtering and decision-makingโthereby decoupling the action space and enabling collaborative optimization. Our method integrates policy gradient training, a proposal-filtering mechanism, and an L2RPN-customized reward function. Evaluated across multiple L2RPN benchmarks, CCMA achieves a 37% improvement in sample efficiency and a 22% increase in critical fault recovery success rate over baseline methods. It also demonstrates superior convergence stability and robustness under varying operational conditions. These results indicate strong potential for real-world deployment in modern power grids.
๐ Abstract
Power grid operation is becoming more complex due to the increase in generation of renewable energy. The recent series of Learning To Run a Power Network (L2RPN) competitions have encouraged the use of artificial agents to assist human dispatchers in operating power grids. However, the combinatorial nature of the action space poses a challenge to both conventional optimizers and learned controllers. Action space factorization, which breaks down decision-making into smaller sub-tasks, is one approach to tackle the curse of dimensionality. In this study, we propose a centrally coordinated multi-agent (CCMA) architecture for action space factorization. In this approach, regional agents propose actions and subsequently a coordinating agent selects the final action. We investigate several implementations of the CCMA architecture, and benchmark in different experimental settings against various L2RPN baseline approaches. The CCMA architecture exhibits higher sample efficiency and superior final performance than the baseline approaches. The results suggest high potential of the CCMA approach for further application in higher-dimensional L2RPN as well as real-world power grid settings.