Referential communication in heterogeneous communities of pre-trained visual deep networks

📅 2023-02-04
🏛️ Adaptive Agents and Multi-Agent Systems
📈 Citations: 6
Influential: 0
🤖 AI Summary
This work investigates whether heterogeneous pre-trained vision models can spontaneously establish a shared referential protocol for cross-architecture, cross-paradigm semantic communication in an unsupervised setting. Method: We propose a multi-agent self-supervised referential game framework that integrates contrastive learning, feature alignment, and protocol distillation to enable collaborative construction of transferable high-level semantic representations. Contribution/Results: Experiments show that models achieve 89% inter-model referential accuracy; new agents rapidly adapt to the established protocol with minimal interaction; the learned protocol explicitly encodes object-level semantics and generalizes to unseen categories. To our knowledge, this is the first work demonstrating self-organized semantic communication across vision models trained under disparate paradigms (e.g., masked autoencoding, contrastive learning, and supervised pretraining), thereby establishing a novel paradigm for interpretable, cooperative neural network systems.
📝 Abstract
As large pre-trained image-processing neural networks are being embedded in autonomous agents such as self-driving cars or robots, the question arises of how such systems can communicate with each other about the surrounding world, despite their different architectures and training regimes. As a first step in this direction, we systematically explore the task of referential communication in a community of heterogeneous state-of-the-art pre-trained visual networks, showing that they can develop, in a self-supervised way, a shared protocol to refer to a target object among a set of candidates. This shared protocol can also be used, to some extent, to communicate about previously unseen object categories of different granularity. Moreover, a visual network that was not initially part of an existing community can learn the community's protocol with remarkable ease. Finally, we study, both qualitatively and quantitatively, the properties of the emergent protocol, providing some evidence that it is capturing high-level semantic features of objects.
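The referential task described in the abstract (a sender refers to a target object, a receiver picks it out among candidates) can be illustrated with a minimal sketch of a single game round. This is not the paper's implementation: the continuous message, the linear maps `W_send` and `W_recv`, the function name `play_round`, and the random features standing in for the outputs of two different frozen pre-trained encoders are all assumptions made for illustration.

```python
import numpy as np


def play_round(sender_feats, receiver_feats, W_send, W_recv, target_idx):
    """One referential-game round over a set of candidate images (illustrative).

    sender_feats:   (n_candidates, d_s) features from the sender's frozen encoder
    receiver_feats: (n_candidates, d_r) features from the receiver's frozen encoder
    W_send:         (d_s, d_msg) sender's map into message space (assumed linear)
    W_recv:         (d_r, d_msg) receiver's map into message space (assumed linear)
    """
    # The sender only uses its own view of the target to produce a message.
    message = sender_feats[target_idx] @ W_send
    # The receiver scores each candidate, seen through its own encoder.
    scores = (receiver_feats @ W_recv) @ message
    # Log-softmax over candidates, then cross-entropy on the true target.
    log_probs = scores - np.logaddexp.reduce(scores)
    loss = -log_probs[target_idx]
    guess = int(np.argmax(scores))
    return guess, float(loss)


# Hypothetical stand-ins for two heterogeneous networks: the feature
# dimensions differ (8 vs. 6), as they would for different architectures.
rng = np.random.default_rng(0)
n_candidates, d_s, d_r, d_msg = 5, 8, 6, 4
sender_feats = rng.normal(size=(n_candidates, d_s))
receiver_feats = rng.normal(size=(n_candidates, d_r))
W_send = rng.normal(size=(d_s, d_msg))
W_recv = rng.normal(size=(d_r, d_msg))

guess, loss = play_round(sender_feats, receiver_feats, W_send, W_recv, target_idx=3)
print(f"receiver guessed candidate {guess}, loss {loss:.3f}")
```

In a setup of this kind, only the two maps would be trained (by minimizing the loss over many rounds) while the encoders stay frozen, and the only training signal is whether the receiver recovers the target, so no labels are needed. With untrained random maps, as here, the guess is no better than chance.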
Problem

Research questions and friction points this paper is trying to address.

Cross-network Communication
Shared Representation
Object Recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inter-network Communication
Shared Representation Learning
Object Recognition