🤖 AI Summary
This work establishes convergence guarantees for deep actor-critic algorithms solving Mean Field Game (MFG) and Mean Field Control (MFC) problems in continuous state-action spaces over an infinite horizon, and extends the analysis to Mean Field Control Games (MFCGs), a class of problems featuring both local cooperation and global competition. Methodologically, we propose a unified two-timescale analysis framework based on the ratio between two learning rates, rigorously distinguishing the limiting dynamics of MFGs and MFCs; we further introduce a discretization of the state and action spaces to ensure identifiability of the limiting behavior in continuous domains. Theoretically, we prove global convergence of the algorithm for a class of linear-quadratic MFCGs. Numerical experiments demonstrate high-precision approximation of the explicit optimal solutions. To the best of our knowledge, this is the first convergence guarantee for mean-field reinforcement learning in infinite-horizon, continuous state-action settings.
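To make the two-timescale mechanism concrete, here is a minimal, hypothetical sketch in a tabular toy setting, not the paper's deep actor-critic implementation: the value estimate and the mean-field estimate are updated with separate learning rates, and, following the finite-space analysis cited below, choosing `lr_mu` much smaller than `lr_Q` (mean field on the slow timescale) targets the MFG limit, while reversing the ratio targets MFC. The environment `step`, the state/action sizes, and the congestion-style reward are all placeholder assumptions.

```python
import numpy as np

n_states, n_actions = 5, 3
rng = np.random.default_rng(0)

Q = np.zeros((n_states, n_actions))     # value-function estimate
mu = np.full(n_states, 1.0 / n_states)  # mean-field (state-distribution) estimate

# lr_mu << lr_Q: the mean field evolves on the slow timescale (MFG-type limit);
# swapping the ratio (lr_mu >> lr_Q) would target the MFC-type limit.
lr_Q, lr_mu = 0.1, 0.001
gamma = 0.9

def step(s, a, mu):
    """Toy dynamics and mean-field-dependent reward (placeholders)."""
    s_next = rng.integers(n_states)
    reward = -abs(s - a) - mu[s]  # cost grows with local congestion
    return s_next, reward

s = 0
for _ in range(10_000):
    # epsilon-greedy action selection
    a = rng.integers(n_actions) if rng.random() < 0.1 else Q[s].argmax()
    s_next, r = step(s, a, mu)
    # Fast timescale: temporal-difference update of Q given the (quasi-frozen) mean field.
    Q[s, a] += lr_Q * (r + gamma * Q[s_next].max() - Q[s, a])
    # Slow timescale: nudge the mean-field estimate toward the visited state.
    e_s = np.eye(n_states)[s]
    mu += lr_mu * (e_s - mu)
    s = s_next
```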
📄 Abstract
We establish the convergence of the deep actor-critic reinforcement learning algorithm presented in [Angiuli et al., 2023a] in the setting of continuous state and action spaces with an infinite discrete-time horizon. This algorithm provides solutions to Mean Field Game (MFG) or Mean Field Control (MFC) problems depending on the ratio between two learning rates: one for the value function and the other for the mean field term. In the MFC case, to rigorously identify the limit, we introduce a discretization of the state and action spaces, following the approach used in the finite-space case in [Angiuli et al., 2023b]. The convergence proofs rely on a generalization of the two-timescale framework introduced in [Borkar, 1997]. We further extend our convergence results to Mean Field Control Games, which involve locally cooperative and globally competitive populations. Finally, we present numerical experiments for linear-quadratic problems in one and two dimensions, for which explicit solutions are available.
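As an illustration of the discretization idea mentioned above, the following hypothetical sketch, not the paper's construction, estimates the mean-field term by projecting continuous state samples onto a uniform grid, so the distribution is supported on finitely many atoms and its limit can be identified. The grid bounds, bin count, and sampling distribution are assumptions made for the example.

```python
import numpy as np

def project_to_grid(x, lo=-2.0, hi=2.0, n_bins=50):
    """Map continuous states to the centers of uniform grid cells."""
    edges = np.linspace(lo, hi, n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[idx], idx

rng = np.random.default_rng(1)
states = rng.normal(0.0, 0.5, size=1_000)  # placeholder continuous samples
_, idx = project_to_grid(states)
# Discretized mean-field term: a probability vector over the grid cells.
mu_hat = np.bincount(idx, minlength=50) / len(states)
print(mu_hat.sum())  # 1.0 (up to floating point)
```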