🤖 AI Summary
This work addresses slow policy learning and poor generalization in deep reinforcement learning (DRL), attributed to inadequate modeling of continuous action spaces. We propose embedding a ring attractor—a differentiable, neurobiologically inspired dynamical system that explicitly encodes spatial structure—as an action representation module within DRL frameworks. To our knowledge, this is the first method enabling end-to-end joint training of ring attractors with both DQN and policy gradient architectures, supporting both exogenous modeling and endogenous integration paradigms. The ring attractor intrinsically captures topological action structure (e.g., rotational angles, tactical adjacency). Evaluated on the Atari 100k low-data benchmark, our approach achieves a 53% average performance gain over prior state-of-the-art methods, with substantial improvements in learning efficiency and policy precision. The implementation is publicly available.
📝 Abstract
This paper explores the integration of ring attractors, a mathematical model inspired by neural circuit dynamics, into the Reinforcement Learning (RL) action selection process. Serving as specialized brain-inspired structures that encode spatial information and uncertainty, ring attractors offer a biologically plausible mechanism to improve learning speed and accuracy in RL. They do so by explicitly encoding the action space, facilitating the organization of neural activity, and enabling the distribution of spatial representations across the neural network in the context of Deep Reinforcement Learning (DRL). For example, they preserve the continuity between rotation angles in robotic control or the adjacency between tactical moves in game-like environments. Applying ring attractors to action selection involves mapping actions to specific locations on the ring and decoding the selected action from the resulting neural activity. We investigate ring attractors both by building an exogenous model and by integrating them directly into DRL agents. Our approach significantly improves state-of-the-art performance on the Atari 100k benchmark, achieving a 53% average performance increase over selected state-of-the-art baselines. Codebase available at https://anonymous.4open.science/r/RA_RL-8026.
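The abstract's core mechanism—mapping actions onto ring positions and decoding the chosen action from population activity—can be sketched in a minimal, non-differentiable form. This is an illustrative toy, not the paper's implementation: the bump widths (`kappa`), neuron count, and population-vector decoding are assumptions for demonstration, and action scores are assumed non-negative so they can act as mixture weights.

```python
import numpy as np

def action_angles(n_actions):
    # Evenly space the discrete actions around the ring [0, 2*pi),
    # so adjacent actions occupy adjacent positions.
    return 2 * np.pi * np.arange(n_actions) / n_actions

def encode_bump(scores, angles, n_neurons=64, kappa=4.0):
    # Each ring neuron has a preferred angle; activity is a
    # score-weighted sum of von Mises bumps centered on each
    # action's ring position. `scores` assumed non-negative here.
    preferred = 2 * np.pi * np.arange(n_neurons) / n_neurons
    bumps = np.exp(kappa * np.cos(preferred[None, :] - angles[:, None]))
    bumps /= bumps.max(axis=1, keepdims=True)
    return scores @ bumps  # shape: (n_neurons,)

def decode_action(activity, angles):
    # Population-vector decoding: take the circular mean of the
    # activity profile, then snap to the nearest action's angle.
    preferred = 2 * np.pi * np.arange(len(activity)) / len(activity)
    vec = np.sum(activity * np.exp(1j * preferred))
    decoded = np.angle(vec) % (2 * np.pi)
    diffs = np.angle(np.exp(1j * (angles - decoded)))
    return int(np.argmin(np.abs(diffs)))

angles = action_angles(4)
scores = np.array([0.1, 0.9, 0.2, 0.1])   # e.g. softmaxed Q-values
chosen = decode_action(encode_bump(scores, angles), angles)  # -> 1
```

Because nearby actions excite overlapping neurons, evidence for one action also supports its neighbors—the topological smoothing the abstract describes; the paper's end-to-end version would instead realize the ring as a differentiable layer trained jointly with the agent.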