🤖 AI Summary
The integration of Bayesian inference with reinforcement learning (RL) has lacked a systematic, comprehensive survey. Method: This work provides the first holistic review of Bayesian RL for agent decision-making, covering model-based, model-free, and inverse RL paradigms. It unifies variational inference, Bayesian deep learning, meta-learning, and active learning, and constructs the first end-to-end classification and comparative framework spanning the full pipeline, from data acquisition and processing to policy learning. Contribution/Results: It proposes a cross-paradigm performance evaluation framework, quantitatively demonstrating Bayesian RL's advantages in data efficiency, generalization, interpretability, and safety, and further elucidates its mechanistic roles in challenging settings, including partial observability, unknown reward functions, and multi-agent environments. This synthesis delivers theoretical guidance and principled method-selection criteria for developing safe, robust, and interpretable autonomous agents.
📝 Abstract
Bayesian inference offers several advantages for agent decision making (e.g., for robotic or simulated agents) over conventional data-driven black-box neural networks: data efficiency, generalization, interpretability, and safety, all of which benefit directly or indirectly from Bayesian uncertainty quantification. However, few comprehensive reviews summarize the progress of Bayesian inference in reinforcement learning (RL) for decision making, so researchers lack a systematic understanding of the field. This paper focuses on the combination of Bayesian inference with RL, which is now an important approach to agent decision making. Specifically, it discusses the following five topics: 1) Bayesian methods with potential for agent decision making: first, basic Bayesian methods and models (Bayes' rule, Bayesian learning, and Bayesian conjugate models), followed by variational inference, Bayesian optimization, Bayesian deep learning, Bayesian active learning, Bayesian generative models, Bayesian meta-learning, and lifelong Bayesian learning. 2) Classical combinations of Bayesian methods with model-based RL (including approximation methods), model-free RL, and inverse RL. 3) The latest combinations of promising Bayesian methods with RL. 4) Analytical comparisons of methods that combine Bayesian inference with RL with respect to data efficiency, generalization, interpretability, and safety. 5) In-depth discussions of six complex RL problem variants, including unknown-reward, partially observable, multi-agent, multi-task, non-linear non-Gaussian, and hierarchical RL problems, together with a summary of how Bayesian methods operate in the data collection, data processing, and policy learning stages of RL, paving the way for better agent decision-making strategies.
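To make the connection between Bayesian conjugate models and data-efficient RL concrete, here is a minimal, hedged sketch (not from the surveyed paper) of a Beta-Bernoulli conjugate update paired with Thompson sampling on a multi-armed bandit; the function names `update_beta` and `thompson_choice` are illustrative inventions:

```python
import random

def update_beta(a, b, wins, losses):
    """Conjugate posterior update for a Bernoulli likelihood.

    A Beta(a, b) prior over an arm's success probability updates in
    closed form to Beta(a + wins, b + losses) -- no approximate
    inference is needed, which is the appeal of conjugate models.
    """
    return a + wins, b + losses

def thompson_choice(posteriors, rng=random):
    """Pick an arm by Thompson sampling.

    Draw one success probability per arm from its Beta posterior and
    pull the arm with the largest draw; posterior uncertainty thereby
    drives exploration directly.
    """
    samples = [rng.betavariate(a, b) for a, b in posteriors]
    return max(range(len(samples)), key=samples.__getitem__)
```

In use, an agent starts each arm at the uniform prior Beta(1, 1), calls `thompson_choice` to act, and folds each observed success or failure back in with `update_beta`; as evidence accumulates, the posteriors sharpen and exploration narrows automatically, illustrating the uncertainty-driven data efficiency discussed above.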