🤖 AI Summary
To address the trade-off between low latency and semantic richness in real-time navigation for visually impaired users, this paper proposes VAG-EC, a cognition-inspired emergent communication framework. VAG-EC integrates knowledge graph modeling and task-driven graph-structured attention into emergent communication: it encodes object relationships in a knowledge graph and uses selective attention to generate compact, interpretable, and context-sensitive symbolic tactile instructions. The communication protocol is optimized jointly by combining graph neural networks with multi-agent reinforcement learning. Evaluated on the TopSim and CI metrics, VAG-EC achieves 23.6% and 18.4% improvements in multi-scale vocabulary coverage and message length efficiency, respectively, significantly outperforming all baselines. It also enables millisecond-level vibrotactile feedback, reconciling real-time responsiveness with semantic expressivity.
📝 Abstract
Assistive systems for visually impaired individuals must deliver rapid, interpretable, and adaptive feedback to facilitate real-time navigation. Current approaches face a trade-off between latency and semantic richness: natural language-based systems provide detailed guidance but are too slow for dynamic scenarios, while emergent communication frameworks offer low-latency symbolic languages but lack semantic depth, limiting their utility in tactile modalities like vibration. To address these limitations, we introduce a novel framework, Cognitively-Inspired Emergent Communication via Knowledge Graphs (VAG-EC), which emulates human visual perception and cognitive mapping. Our method constructs knowledge graphs to represent objects and their relationships, incorporating attention mechanisms to prioritize task-relevant entities, thereby mirroring human selective attention. This structured approach enables the emergence of compact, interpretable, and context-sensitive symbolic languages. Extensive experiments across varying vocabulary sizes and message lengths demonstrate that VAG-EC outperforms traditional emergent communication methods in Topographic Similarity (TopSim) and Context Independence (CI). These findings underscore the potential of cognitively grounded emergent communication as a fast, adaptive, and human-aligned solution for real-time assistive technologies. Code is available at https://github.com/Anonymous-NLPcode/Anonymous_submission/tree/main.
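For readers unfamiliar with Topographic Similarity, the sketch below shows one common way the metric is computed in the emergent communication literature: the Spearman rank correlation between pairwise distances in meaning space and pairwise distances in message space. The abstract does not state which distance functions VAG-EC uses, so the Hamming distance over attribute tuples, the Levenshtein distance over messages, and all function names here are illustrative assumptions rather than the authors' implementation.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length attribute tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def edit_distance(s, t):
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined here; 0.0 is a convention
    return cov / (vx * vy) ** 0.5


def topsim(meanings, messages):
    """Spearman correlation between meaning and message pairwise distances."""
    pairs = list(combinations(range(len(meanings)), 2))
    d_meaning = [hamming(meanings[i], meanings[j]) for i, j in pairs]
    d_message = [edit_distance(messages[i], messages[j]) for i, j in pairs]
    return pearson(average_ranks(d_meaning), average_ranks(d_message))


# Toy example (made-up data): each attribute maps to one symbol position.
meanings = [(0, 0), (0, 1), (1, 0), (1, 1)]
compositional = ["aa", "ab", "ba", "bb"]
degenerate = ["aa", "aa", "aa", "aa"]
score_compositional = topsim(meanings, compositional)
score_degenerate = topsim(meanings, degenerate)
```

In this toy setting, a language in which each attribute controls one symbol scores close to 1, while a language that emits the same message for every input scores 0 under the convention chosen above. A higher TopSim is usually read as a sign that similar inputs receive similar messages.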