🤖 AI Summary
Conventional PID controllers adapt poorly to varying plasma shape control tasks, while existing reinforcement learning (RL) methods generalize weakly and require task-specific retraining. Method: This paper proposes a zero-shot generative RL framework trained on large-scale offline PID demonstration data. It integrates Generative Adversarial Imitation Learning (GAIL) with Hilbert-space representation learning to embed geometric structure into the latent space, enabling zero-shot policy transfer across reference trajectories without online fine-tuning. Contribution/Results: On the HL-3 tokamak simulator, the framework tracks diverse plasma shape reference trajectories stably and precisely. It improves control flexibility and makes fuller use of historical discharge data, offering a scalable approach to intelligent control of fusion devices.
📝 Abstract
Traditional PID controllers have limited adaptability for plasma shape control, and task-specific reinforcement learning (RL) methods generalize poorly and must be retrained for each new task. To overcome these challenges, this paper proposes a framework for developing a versatile, zero-shot control policy from a large-scale offline dataset of historical PID-controlled discharges. Our approach combines Generative Adversarial Imitation Learning (GAIL) with Hilbert space representation learning to pursue two objectives: mimicking the stable operational style of the PID data and constructing a geometrically structured latent space for efficient, goal-directed control. The resulting foundation policy can be deployed for diverse trajectory tracking tasks in a zero-shot manner, without any task-specific fine-tuning. Evaluations on the HL-3 tokamak simulator demonstrate that the policy tracks reference trajectories for key shape parameters precisely and stably across a range of plasma scenarios. This work presents a viable pathway toward highly flexible and data-efficient intelligent control systems for future fusion reactors.
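The two objectives named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual loss or reward, so everything below is an assumption made for illustration: the function names, the `weight` parameter, the use of a unit latent direction toward the goal as the goal-directed term, and the `-log(1 - D)` form of the imitation term (one common GAIL convention, with the discriminator output read as the probability that a transition looks like the PID data).

```python
import numpy as np


def latent_direction(phi_s, phi_g):
    """Unit vector pointing from the current latent state toward the goal latent.

    Assumption: the learned embedding phi places states so that straight-line
    motion toward phi_g corresponds to progress toward the goal.
    """
    diff = np.asarray(phi_g, dtype=float) - np.asarray(phi_s, dtype=float)
    norm = np.linalg.norm(diff)
    # If the state already coincides with the goal, there is no direction to follow.
    return diff / norm if norm > 0 else np.zeros_like(diff)


def directional_reward(phi_s, phi_next, z):
    """Progress of one latent-space step measured along direction z."""
    step = np.asarray(phi_next, dtype=float) - np.asarray(phi_s, dtype=float)
    return float(np.dot(step, z))


def imitation_reward(d_prob, eps=1e-8):
    """GAIL-style reward from a discriminator probability in [0, 1].

    Assumption: d_prob is the probability that the transition is expert-like,
    so higher d_prob gives a higher reward.
    """
    return float(-np.log(1.0 - d_prob + eps))


def combined_reward(phi_s, phi_next, phi_g, d_prob, weight=0.5):
    """Hypothetical weighted sum of the goal-directed and imitation terms."""
    z = latent_direction(phi_s, phi_g)
    return directional_reward(phi_s, phi_next, z) + weight * imitation_reward(d_prob)


if __name__ == "__main__":
    phi_s = np.array([0.0, 0.0])     # latent of the current plasma state (made up)
    phi_g = np.array([3.0, 4.0])     # latent of the reference shape (made up)
    phi_next = np.array([0.6, 0.8])  # latent after one control step (made up)
    print(combined_reward(phi_s, phi_next, phi_g, d_prob=0.5))
```

In this toy setup, a step that moves directly toward the goal latent earns the largest goal-directed term, while the imitation term grows as the discriminator finds the behavior harder to tell apart from the PID demonstrations. The weighting between the two terms is a free design choice here and is not taken from the paper.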