🤖 AI Summary
This work addresses two problems: the operational uncertainty introduced by high-penetration renewable energy integration, and the lack of high-quality benchmarks for evaluating large language models (LLMs) on professional-grade optimal power flow (OPF) modeling. The authors propose the NL-to-OPF framework, which integrates natural language processing, LLMs, and domain expertise to automatically generate and verify executable OPF code from natural-language descriptions of dispatch requirements. The core contribution is the first OPF modeling dataset and benchmark tailored to power system scenarios, ProOPF-D and ProOPF-B, comprising 12K training samples and 121 expert-annotated test cases. The benchmark enables end-to-end evaluation of both concrete and abstract OPF modeling tasks, providing a rigorous and reproducible standard for assessing LLM capabilities in power system optimization.
📝 Abstract
Growing renewable penetration introduces substantial uncertainty into power system operations, necessitating frequent adaptation of dispatch objectives and constraints and challenging expertise-intensive, near-real-time modeling workflows. Large Language Models (LLMs) provide a promising avenue for automating this process by translating natural-language (NL) operational requirements into executable optimization models via semantic reasoning and code synthesis. Yet existing LLM datasets and benchmarks for optimization modeling primarily target coarse-grained cross-domain generalization and offer little rigorous evaluation in power-system settings, particularly for Optimal Power Flow (OPF). We therefore introduce ProOPF-D and ProOPF-B, a dataset and benchmark for professional-grade OPF modeling. ProOPF-D contains 12K instances pairing NL requests with parameter adjustments and structural extensions to a canonical OPF, together with executable implementations. ProOPF-B provides 121 expert-annotated test cases with ground-truth code, enabling end-to-end evaluation under both concrete and abstract OPF modeling regimes.
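For orientation, the "canonical OPF" mentioned in the abstract can be read against the standard textbook AC OPF in polar form, sketched below. This is an illustrative assumption: the abstract does not state the exact formulation used in ProOPF-D, and the symbols here (bus set $\mathcal{N}$, generator set $\mathcal{G}$, line set $\mathcal{L}$, cost functions $c_i$, admittance entries $G_{ij} + jB_{ij}$) are introduced only for this sketch.

```latex
% Textbook AC OPF (illustrative; not taken from the paper).
% \theta_{ij} = \theta_i - \theta_j; P_{G,i} = Q_{G,i} = 0 at buses without generation.
\begin{align}
\min_{P_G,\,Q_G,\,|V|,\,\theta} \quad
  & \sum_{i \in \mathcal{G}} c_i\!\left(P_{G,i}\right) \\
\text{s.t.} \quad
  & P_{G,i} - P_{D,i} = \sum_{j \in \mathcal{N}} |V_i||V_j|
      \left(G_{ij}\cos\theta_{ij} + B_{ij}\sin\theta_{ij}\right),
      && \forall i \in \mathcal{N} \\
  & Q_{G,i} - Q_{D,i} = \sum_{j \in \mathcal{N}} |V_i||V_j|
      \left(G_{ij}\sin\theta_{ij} - B_{ij}\cos\theta_{ij}\right),
      && \forall i \in \mathcal{N} \\
  & P_{G,i}^{\min} \le P_{G,i} \le P_{G,i}^{\max}, \quad
    Q_{G,i}^{\min} \le Q_{G,i} \le Q_{G,i}^{\max},
      && \forall i \in \mathcal{G} \\
  & V_i^{\min} \le |V_i| \le V_i^{\max},
      && \forall i \in \mathcal{N} \\
  & |S_{ij}| \le S_{ij}^{\max},
      && \forall (i,j) \in \mathcal{L}
\end{align}
```

Under this reading, a parameter adjustment would change data while leaving the model's form intact (for example a demand $P_{D,i}$ or a line rating $S_{ij}^{\max}$), whereas a structural extension would add or replace a term or constraint (for example an extra cost term in the objective or a new coupling constraint across generators). These two examples are hypothetical and are not drawn from the dataset.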