Mimicking Human Intuition: Cognitive Belief-Driven Q-Learning

📅 2024-10-02
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address low sample efficiency, poor robustness, and limited interpretability in reinforcement learning, this paper proposes a cognitive belief-driven Q-learning framework. Methodologically, it explicitly incorporates human-like intuitive subjective beliefs into Q-learning for the first time, introducing a clustering-based representation of subjective belief distributions and integrating Bayesian belief updating to enable joint reasoning over historical experience and current context—thereby effectively mitigating Q-value overestimation. The end-to-end framework synergistically unifies principles from cognitive science, clustering-based representation learning, and classical Q-learning. Empirical evaluation across diverse complex discrete-control tasks demonstrates substantial improvements in policy robustness and environmental adaptability; decision-making behavior exhibits greater alignment with human intuition. The framework consistently outperforms state-of-the-art Q-learning variants across all major performance metrics.

Technology Category

Application Category

📝 Abstract
Reinforcement learning encounters challenges in various environments related to robustness and explainability. Traditional Q-learning algorithms cannot effectively make decisions and utilize the historical learning experience. To overcome these limitations, we propose Cognitive Belief-Driven Q-Learning (CBDQ), which integrates subjective belief modeling into the Q-learning framework, enhancing decision-making accuracy by endowing agents with human-like learning and reasoning capabilities. Drawing inspiration from cognitive science, our method maintains a subjective belief distribution over the expectation of actions, leveraging a cluster-based subjective belief model that enables agents to reason about the potential probability associated with each decision. CBDQ effectively mitigates overestimated phenomena and optimizes decision-making policies by integrating historical experiences with current contextual information, mimicking the dynamics of human decision-making. We evaluate the proposed method on discrete control benchmark tasks in various complicate environments. The results demonstrate that CBDQ exhibits stronger adaptability, robustness, and human-like characteristics in handling these environments, outperforming other baselines. We hope this work will give researchers a fresh perspective on understanding and explaining Q-learning.
Problem

Research questions and friction points this paper is trying to address.

Improving sample efficiency in reinforcement learning methods
Guiding agents toward informative decision-making using cognitive principles
Enhancing decision-making under uncertainty with a belief system
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cognitive Belief-Driven Reinforcement Learning framework
Belief system optimizing action probabilities
Organizes state-action pairs for generalization
🔎 Similar Papers
No similar papers found.
Xingrui Gu
Xingrui Gu
Master Student, University of California, Berkeley
Learning TheoryHuman Centered AI
Guanren Qiao
Guanren Qiao
The Chinese University of HongKong, Shenzhen
reinforcement learningEmbodied AI (locomotion/manipulation)
C
Chuyi Jiang
Department of Electrical Engineering, Columbia University
T
Tianqing Xia
Department of Informatics, King’s College London
H
Hangyu Mao
Kuaishou Technology