PGLearn -- An Open-Source Learning Toolkit for Optimal Power Flow

📅 2025-05-28

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Machine learning (ML) research on optimal power flow (OPF) is held back by scarce benchmark datasets, inconsistent evaluation protocols, and poor reproducibility. To address this, the paper introduces PGLearn, an open-source suite of standardized datasets and evaluation tools for ML and OPF. It makes three main contributions: (1) datasets that reflect realistic operating conditions by capturing both global and local variability in data generation and, for the first time, including time series data for several large-scale systems; (2) support for multiple OPF formulations, including AC, DC, and second-order cone formulations; and (3) a toolkit for training, evaluating, and benchmarking ML models for OPF, with the aim of standardizing performance evaluation. The datasets are publicly available on Hugging Face. By reducing the burden of data generation and enabling fair, comparable, and reproducible evaluation, PGLearn aims to accelerate research on data-driven power system optimization.

📝 Abstract

Machine Learning (ML) techniques for Optimal Power Flow (OPF) problems have recently garnered significant attention, reflecting a broader trend of leveraging ML to approximate and/or accelerate the resolution of complex optimization problems. These developments are necessitated by the increased volatility and scale in energy production for modern and future grids. However, progress in ML for OPF is hindered by the lack of standardized datasets and evaluation metrics, from generating and solving OPF instances, to training and benchmarking machine learning models. To address this challenge, this paper introduces PGLearn, a comprehensive suite of standardized datasets and evaluation tools for ML and OPF. PGLearn provides datasets that are representative of real-life operating conditions, by explicitly capturing both global and local variability in the data generation, and by, for the first time, including time series data for several large-scale systems. In addition, it supports multiple OPF formulations, including AC, DC, and second-order cone formulations. Standardized datasets are made publicly available to democratize access to this field, reduce the burden of data generation, and enable the fair comparison of various methodologies. PGLearn also includes a robust toolkit for training, evaluating, and benchmarking machine learning models for OPF, with the goal of standardizing performance evaluation across the field. By promoting open, standardized datasets and evaluation metrics, PGLearn aims at democratizing and accelerating research and innovation in machine learning applications for optimal power flow problems. Datasets are available for download at https://www.huggingface.co/PGLearn.

Problem

Research questions and friction points this paper is trying to address.

Lack of standardized datasets for ML in OPF problems

Absence of unified evaluation metrics for OPF methodologies

Need for tools to accelerate ML-based OPF research
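To make the evaluation-metric gap concrete, here is a minimal Python sketch of three quantities often reported when judging an ML prediction of an OPF solution: the relative optimality gap, the worst bound violation, and the power balance residual. The summary above does not say which metrics PGLearn standardizes, so the function names, signatures, and the lossless balance check are illustrative assumptions, not PGLearn's API.

```python
from typing import Sequence


def optimality_gap(predicted_cost: float, optimal_cost: float) -> float:
    """Relative gap between a predicted dispatch's cost and the solver's optimum."""
    if optimal_cost == 0:
        raise ValueError("optimal_cost must be nonzero")
    return abs(predicted_cost - optimal_cost) / abs(optimal_cost)


def max_bound_violation(
    values: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Largest amount by which any value falls outside its [lower, upper] box."""
    worst = 0.0
    for v, lo, hi in zip(values, lower, upper):
        worst = max(worst, lo - v, v - hi)
    return worst


def power_balance_residual(
    generation: Sequence[float], demand: Sequence[float]
) -> float:
    """Absolute mismatch between total generation and total demand (losses ignored)."""
    return abs(sum(generation) - sum(demand))


# Hypothetical two-generator example (all numbers are made up for illustration).
predicted_dispatch = [55.0, 48.0]   # MW, as output by some ML model
gen_lower, gen_upper = [0.0, 0.0], [50.0, 50.0]
demand = [100.0]

gap = optimality_gap(predicted_cost=1510.0, optimal_cost=1500.0)
violation = max_bound_violation(predicted_dispatch, gen_lower, gen_upper)
residual = power_balance_residual(predicted_dispatch, demand)
```

Reporting feasibility measures alongside the optimality gap matters because a prediction can look cheap only because it violates limits or fails to meet demand.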

Innovation

Methods, ideas, or system contributions that make the work stand out.

Standardized datasets for ML and OPF

Includes time series data for large systems

Supports multiple OPF formulations
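As a reference point for the simplest of the supported formulations, a textbook DC-OPF can be written as the linear program below. This is a sketch under stated assumptions: linear generation costs $c_g$, directed branches $\mathcal{E}$, branch susceptances $b_{ij}$, and a single reference bus. The notation is chosen here for illustration, and PGLearn's exact formulation may differ (for example, in its cost functions or additional limits).

```latex
\begin{align}
\min_{p,\,\theta,\,f} \quad & \sum_{g \in \mathcal{G}} c_g \, p_g \\
\text{s.t.} \quad
& \sum_{g \in \mathcal{G}_i} p_g - d_i
  = \sum_{(i,j) \in \mathcal{E}} f_{ij} - \sum_{(j,i) \in \mathcal{E}} f_{ji}
  && \forall i \in \mathcal{N} \\
& f_{ij} = b_{ij} \, (\theta_i - \theta_j)
  && \forall (i,j) \in \mathcal{E} \\
& -\bar{f}_{ij} \le f_{ij} \le \bar{f}_{ij}
  && \forall (i,j) \in \mathcal{E} \\
& \underline{p}_g \le p_g \le \bar{p}_g
  && \forall g \in \mathcal{G} \\
& \theta_{\text{ref}} = 0
\end{align}
```

Here $p_g$ is the active power output of generator $g$, $d_i$ the demand at bus $i$, $\theta_i$ the voltage angle, and $f_{ij}$ the active power flow on branch $(i,j)$. The AC formulation replaces the linear flow equation with the nonlinear, nonconvex power flow equations, and the second-order cone formulation is a convex relaxation of that AC problem.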

🔎 Similar Papers

Beyond the Neural Fog: Interpretable Learning for AC Optimal Power Flow

2024-07-30 · arXiv.org · Citations: 0
