🤖 AI Summary
To address the high computational overhead of conventional optimal control methods under frequently varying objective functions, this paper proposes a transferable control framework based on a Function Encoder (FE). The method adopts an offline–online decomposition paradigm: during offline training, a set of reusable neural basis functions is learned to construct a universal function space; during online deployment, only lightweight coefficient estimation is required for zero-shot adaptation to novel objective functions. By combining imitation learning with projection mappings driven by either data or problem descriptions, the framework generalizes across diverse dynamical systems, state/action dimensions, and cost structures. Experiments demonstrate near-optimal performance on multiple nonlinear systems with minimal online computation, enabling deployment as a semi-global real-time feedback controller. The core contributions are a transferable function-space policy mechanism and an efficient offline–online decoupled architecture.
📝 Abstract
This paper presents a transferable solution method for optimal control problems with varying objectives using Function Encoder (FE) policies. Traditional optimization-based approaches must be re-solved whenever objectives change, resulting in prohibitive computational costs for applications requiring frequent evaluation and adaptation. The proposed method learns a reusable set of neural basis functions that spans the control policy space, enabling efficient zero-shot adaptation to new tasks through either projection from data or direct mapping from problem specifications. The key idea is an offline–online decomposition: basis functions are learned once during offline imitation learning, while online adaptation requires only lightweight coefficient estimation. Numerical experiments across diverse dynamics, dimensions, and cost structures show our method delivers near-optimal performance with minimal overhead when generalizing across tasks, enabling semi-global feedback policies suitable for real-time deployment.
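The offline–online decomposition described in the abstract can be illustrated with a minimal sketch. Here the "learned neural basis policies" are replaced by fixed random feature maps purely to show the mechanics; the names `basis`, `fit_coefficients`, and `policy`, and all dimensions, are illustrative assumptions, not the paper's implementation. The online step is just a least-squares projection of demonstration (state, action) pairs onto the span of the basis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the learned basis policies: k maps g_i : R^n -> R^m.
# In the paper these are neural networks trained offline via imitation
# learning; random tanh features here only illustrate the mechanics.
n_state, n_action, n_basis = 4, 2, 8
W = rng.normal(size=(n_basis, n_action, n_state))

def basis(s):
    # Stacked outputs of all k basis policies at state s: shape (k, m).
    return np.tanh(W @ s)

def fit_coefficients(states, actions):
    """Online zero-shot adaptation: least-squares projection of
    demonstration (state, action) pairs onto the basis-policy span."""
    # Design matrix: basis responses per state, flattened over actions.
    G = np.stack([basis(s).T for s in states])   # (N, m, k)
    A = G.reshape(-1, n_basis)                   # (N*m, k)
    b = np.concatenate(actions)                  # (N*m,)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs                                # (k,)

def policy(s, coeffs):
    # Adapted policy: linear combination of basis-policy outputs.
    return coeffs @ basis(s)

# Demo: recover a task policy that lies in the span of the basis.
true_c = rng.normal(size=n_basis)
states = [rng.normal(size=n_state) for _ in range(20)]
actions = [true_c @ basis(s) for s in states]
c_hat = fit_coefficients(states, actions)

s_new = rng.normal(size=n_state)
print(np.allclose(policy(s_new, c_hat), true_c @ basis(s_new)))  # True
```

The expensive part (training the basis) happens once offline; adapting to a new objective costs only one small least-squares solve, which is what makes real-time deployment feasible.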