🤖 AI Summary
Existing reinforcement learning (RL) methods struggle to simultaneously handle the complex system dynamics, stochastic uncertainties, long-horizon optimization objectives, and stringent physical constraints inherent in power systems. To bridge this gap, the authors introduce RL2Grid, an RL benchmark tailored for realistic grid operations, developed in collaboration with the French transmission system operator RTE. RL2Grid combines high-fidelity physics-based modeling, embedded hard safety constraints, expert-informed policy initialization, and a unified evaluation pipeline spanning multiple algorithms. Its key contributions are: (i) a deep integration of an industrial-grade power system simulator with an RL framework; (ii) standardized state and action spaces, constraint-aware reward design, and multi-task benchmarks (e.g., voltage regulation, load shedding); and (iii) baseline results across representative scenarios that reveal systematic limitations of current RL methods in constraint satisfaction, long-term stability, and policy interpretability, establishing a reproducible, verifiable evaluation paradigm for deploying RL in real-world power systems.
📝 Abstract
Reinforcement learning (RL) can transform power grid operations by providing adaptive and scalable controllers essential for grid decarbonization. However, existing methods struggle with the complex dynamics, aleatoric uncertainty, long-horizon goals, and hard physical constraints that occur in real-world systems. This paper presents RL2Grid, a benchmark designed in collaboration with power system operators to accelerate progress in grid control and foster RL maturity. Built on a power simulation framework developed by RTE France, RL2Grid standardizes tasks, state and action spaces, and reward structures within a unified interface, enabling systematic evaluation and comparison of RL approaches. Moreover, we integrate real control heuristics and safety constraints informed by the operators' expertise to ensure RL2Grid aligns with grid operation requirements. We benchmark popular RL baselines on the grid control tasks represented within RL2Grid, establishing reference performance metrics. Our results and discussion highlight the challenges that power grids pose for RL methods, emphasizing the need for novel algorithms capable of handling real-world physical systems.
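To make the "standardized tasks, state/action spaces, and reward structures" concrete, here is a minimal illustrative sketch of the kind of agent-environment loop such a benchmark exposes. This is not RL2Grid's actual API: the environment class, its observation (per-line loading ratios), the topology-switch action, and the constraint penalty weight below are all hypothetical placeholders, chosen only to show how a constraint-aware reward and a hard-violation episode termination fit a standard reset/step interface.

```python
# Illustrative sketch only; all names and numbers are hypothetical,
# not RL2Grid's real interface.
import random


class ToyGridEnv:
    """Toy stand-in for a grid-control task: keep line loads below 1.0."""

    def __init__(self, n_lines=3, seed=0):
        self.n_lines = n_lines
        self.rng = random.Random(seed)
        self.loads = None

    def reset(self):
        # Observation: per-line loading ratios (1.0 = thermal limit).
        self.loads = [self.rng.uniform(0.5, 0.9) for _ in range(self.n_lines)]
        return list(self.loads)

    def step(self, action):
        # Action: index of the line to relieve (e.g. a topology switch).
        self.loads = [l + self.rng.uniform(0.0, 0.1) for l in self.loads]
        self.loads[action] = max(0.0, self.loads[action] - 0.2)
        violations = sum(l > 1.0 for l in self.loads)
        # Constraint-aware reward: margin to limits minus a hard penalty.
        reward = sum(1.0 - l for l in self.loads) - 10.0 * violations
        done = violations > 0  # an overload ends the episode (blackout proxy)
        return list(self.loads), reward, done


# A trivial heuristic policy, standing in for an operator rule of thumb:
# always relieve the most-loaded line.
env = ToyGridEnv()
obs = env.reset()
total = 0.0
for t in range(20):
    action = max(range(len(obs)), key=lambda i: obs[i])
    obs, reward, done = env.step(action)
    total += reward
    if done:
        break
print(f"episode return: {total:.2f}")
```

The same loop shape is what lets different RL algorithms be swapped in and compared under identical task definitions, which is the benchmark's core design point.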