🤖 AI Summary
This paper addresses the mutual information optimal control problem (MIOCP) for discrete-time linear systems, jointly optimizing the control policy and the prior distribution — thereby relaxing the restrictive assumption in maximum entropy optimal control (MEOCP) that the prior is fixed to the uniform distribution. Under Gaussian policy and prior classes, the authors derive closed-form solutions for the optimal policy with the prior fixed and for the optimal prior with the policy fixed, and propose an alternating minimization algorithm built from these two updates. The MIOCP formulation enhances policy expressiveness relative to the MEOCP and offers an information-theoretic handle on the exploration–exploitation trade-off. Numerical experiments illustrate how the proposed algorithm behaves in comparison with baseline methods.
📝 Abstract
In this paper, we formulate a mutual information optimal control problem (MIOCP) for discrete-time linear systems. This problem can be regarded as an extension of the maximum entropy optimal control problem (MEOCP). Unlike the MEOCP, where the prior is fixed to the uniform distribution, the MIOCP optimizes the policy and the prior simultaneously. As analytical results, under policy and prior classes consisting of Gaussian distributions, we derive the optimal policy of the MIOCP with the prior fixed, and the optimal prior with the policy fixed. Using these results, we propose an alternating minimization algorithm for the MIOCP. Through numerical experiments, we demonstrate how the proposed algorithm works.
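The alternating minimization described above can be sketched on a toy instance. The sketch below is a one-step, scalar mutual-information-regularized control problem, not the paper's multi-stage formulation: the objective is expected quadratic cost plus a temperature-weighted expected KL from the Gaussian policy to the Gaussian prior. With the policy fixed, the KL-optimal Gaussian prior moment-matches the marginal of the input; with the prior fixed, the pointwise-optimal policy is the Gaussian tilt of the prior by the exponentiated negative cost. All symbols (`alpha`, `R`, `G`, `P`) are illustrative assumptions, not the paper's notation.

```python
import numpy as np

# One-step scalar sketch of the alternating minimization:
#   minimize over (policy, prior):  E[cost] + alpha * E_x[ KL(policy(.|x) || prior) ]
# with x ~ N(0, P), cost(x, u) = R * (u - G*x)^2,
# Gaussian policy u|x ~ N(K*x + k, Sig), Gaussian prior u ~ N(m, S).
# All constants are illustrative, not taken from the paper.
alpha, R, G, P = 1.0, 1.0, 1.0, 1.0

def objective(K, k, Sig, m, S):
    expected_cost = R * ((K - G) ** 2 * P + k ** 2 + Sig)
    # closed-form E_x[ KL( N(K*x + k, Sig) || N(m, S) ) ]
    kl = 0.5 * (Sig / S + (K ** 2 * P + (k - m) ** 2) / S - 1.0 + np.log(S / Sig))
    return expected_cost + alpha * kl

# initialize with a zero-mean, unit-variance policy and prior
K, k, Sig, m, S = 0.0, 0.0, 1.0, 0.0, 1.0
history = [objective(K, k, Sig, m, S)]
for _ in range(20):
    # prior step: with the policy fixed, the KL-optimal Gaussian prior
    # moment-matches the marginal distribution of u
    m, S = k, K ** 2 * P + Sig
    # policy step: with the prior fixed, the pointwise minimizer is
    # policy(.|x) proportional to prior(u) * exp(-cost(x, u) / alpha),
    # which is again Gaussian with the parameters below
    Sig = 1.0 / (1.0 / S + 2.0 * R / alpha)
    K = Sig * 2.0 * R * G / alpha
    k = Sig * m / S
    history.append(objective(K, k, Sig, m, S))

# each step exactly minimizes the objective over its own block,
# so the objective values are non-increasing
print(history[0], history[-1])
```

Because each block update is an exact minimizer over its own variable, the objective decreases monotonically, which is the mechanism behind the convergence guarantee the paper claims for its (more general, multi-stage) algorithm.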