🤖 AI Summary
Modeling social norms in embodied navigation remains an open challenge. This paper introduces a hierarchical "brain-action" foundation model for socially aware navigation. It constructs a large-scale cognitive activation dataset and an expert trajectory pyramid, and proposes SAFE-GRPO, the first flow-based reinforcement learning framework for embodied navigation that explicitly rewards socially compliant behaviors. The approach integrates imitation learning, cognitive reasoning signal modeling, and multi-source expert demonstrations. Experiments show that the method improves success rate by 38% and social compliance rate by 46% over the state-of-the-art method. It jointly optimizes high-level understanding of social rules and low-level generation of compliant trajectories, yielding gains in both navigation performance and social adaptability.
📝 Abstract
Embodied navigation that adheres to social norms remains an open research challenge. We present SocialNav, a foundation model for socially aware navigation with a hierarchical "brain-action" architecture, capable of understanding high-level social norms and generating low-level, socially compliant trajectories. To enable these dual capabilities, we construct the SocNav Dataset, a large-scale collection of 7 million samples comprising (1) a Cognitive Activation Dataset that provides social reasoning signals such as chain-of-thought explanations and social traversability prediction, and (2) an Expert Trajectories Pyramid that aggregates diverse navigation demonstrations from internet videos, simulated environments, and real-world robots. We propose a multi-stage training pipeline that gradually injects and refines navigation intelligence: we first inject general navigation skills and an understanding of social norms into the model via imitation learning, and then refine these skills through a deliberately designed Socially-Aware Flow Exploration GRPO (SAFE-GRPO), the first flow-based reinforcement learning framework for embodied navigation that explicitly rewards socially compliant behaviors. SocialNav achieves +38% success rate and +46% social compliance rate compared to the state-of-the-art method, demonstrating strong gains in both navigation performance and social compliance. Our project page: https://amap-eai.github.io/SocialNav/
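As a rough illustration of the group-relative idea behind GRPO-style methods such as SAFE-GRPO, the sketch below normalizes the rewards of several rollouts sampled for the same navigation query against the group mean and standard deviation. It is not the paper's implementation: the abstract gives no reward definition, so `toy_reward`, its `w_social` weight, and the rollout tuples are hypothetical placeholders, and the flow-based policy update itself is not shown.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style normalization: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the sampled group
    return [(r - mu) / (sigma + eps) for r in rewards]


def toy_reward(reached_goal, social_violations, w_social=0.5):
    """Hypothetical reward: goal bonus minus a penalty per social-norm violation.

    This shape is an assumption for illustration only, not the paper's reward.
    """
    return (1.0 if reached_goal else 0.0) - w_social * social_violations


# Hypothetical group of four rollouts: (reached_goal, number_of_social_violations)
rollouts = [(True, 0), (True, 2), (False, 0), (False, 1)]
rewards = [toy_reward(goal, violations) for goal, violations in rollouts]
advantages = group_relative_advantages(rewards)

for rollout, adv in zip(rollouts, advantages):
    print(rollout, round(adv, 3))
```

Under this toy reward, the rollout that reaches the goal without violating any norm receives the largest advantage, while a rollout that reaches the goal but commits two violations scores no better than one that fails cleanly. This is the kind of trade-off an explicit social compliance reward is meant to express.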