🤖 AI Summary
To address the poor generalization and weak robustness of deep reinforcement learning (DRL) agents in dynamic environments, this paper proposes EDEN, a biologically inspired navigation framework modeled on the mammalian entorhinal cortex-hippocampus system. Methodologically, it introduces the first trainable grid cell encoder, which self-organizes periodic spatial representations directly from raw visual features (DINO embeddings) and egomotion sensor inputs, enabling egocentric path integration and vector-based navigation; this neural mechanism is tightly integrated with a policy trained via Proximal Policy Optimization (PPO). The contributions are threefold: (1) the first end-to-end differentiable, trainable grid cell encoder; (2) a hybrid navigation paradigm that combines high interpretability with strong generalization; and (3) state-of-the-art performance, with success rates above 94% in complex occluded scenes and 99% in simple scenes across the MiniWorld and Gazebo simulators, together with significant gains in step efficiency, validated on a real-world robotic platform.
📝 Abstract
Deep reinforcement learning agents are often fragile, while humans remain adaptive and flexible across varying scenarios. To bridge this gap, we present EDEN, a biologically inspired navigation framework that integrates learned entorhinal-like grid cell representations with reinforcement learning to enable autonomous navigation. Inspired by the mammalian entorhinal-hippocampal system, EDEN allows agents to perform path integration and vector-based navigation using visual and motion sensor data. At the core of EDEN is a grid cell encoder that transforms egocentric motion into periodic spatial codes, producing low-dimensional, interpretable embeddings of position. To generate these activations from raw sensory input, we use fiducial marker detections in the lightweight MiniWorld simulator and DINO-based visual features in the high-fidelity Gazebo simulator. These spatial representations serve as input to a policy trained with Proximal Policy Optimization (PPO), enabling dynamic, goal-directed navigation. We evaluate EDEN in both MiniWorld, for rapid prototyping, and Gazebo, which offers realistic physics and perception noise. Compared to baseline agents using raw state inputs (e.g., position, velocity) or standard convolutional image encoders, EDEN achieves a 99% success rate in simple scenarios and over 94% in complex floorplans with occluded paths, with more efficient and reliable step-wise navigation. In addition, as a replacement for ground-truth activations, we present a trainable grid cell encoder that develops periodic grid-like patterns from vision and motion sensor data, emulating the emergence of such patterns in mammals. This work represents a step toward biologically grounded spatial intelligence in robotics, bridging neural navigation principles with reinforcement learning for scalable deployment.
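To make the idea of a "periodic spatial code" driven by path integration concrete, here is a minimal sketch of one classical hand-crafted formulation: positions are projected onto three axes 60 degrees apart (the hexagonal symmetry observed in entorhinal grid cells) at several spatial scales, and cosines of the resulting phases form the embedding. All scales, phase counts, and function names below are illustrative assumptions; EDEN's actual encoder is learned end-to-end rather than fixed like this.

```python
import numpy as np

def grid_cell_code(position, scales=(0.5, 1.0, 2.0), n_phases=4):
    """Hypothetical periodic spatial code: for each spatial scale, project
    the 2-D position onto three axes 60 degrees apart and take cosines of
    the phases, shifted by several phase offsets per scale."""
    angles = np.array([0.0, np.pi / 3, 2 * np.pi / 3])         # hexagonal axes
    axes = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (3, 2)
    code = []
    for s in scales:
        phases = axes @ position * (2 * np.pi / s)             # shape (3,)
        for offset in np.linspace(0, 2 * np.pi, n_phases, endpoint=False):
            code.append(np.cos(phases + offset))
    return np.concatenate(code)  # low-dimensional periodic embedding

def path_integrate(velocities, dt=0.1):
    """Egocentric path integration: accumulate velocity into position."""
    return np.cumsum(np.asarray(velocities) * dt, axis=0)

# Integrate a short trajectory and encode each resulting position.
vels = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
positions = path_integrate(vels)
codes = np.stack([grid_cell_code(p) for p in positions])
print(codes.shape)  # (3, 36): 3 positions x (3 scales * 3 axes * 4 offsets)
```

In EDEN, an embedding of this kind (produced by the learned encoder from DINO features and motion input rather than from ground-truth position) would be the observation fed to the PPO policy.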