Learning to cooperate with emergent reputation via multi-agent reinforcement learning

📅 2026-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Designing reputation mechanisms that generalize and adapt, so as to promote cooperation in distributed multi-agent social dilemmas where agents have limited perception and cognition, remains a significant challenge. This work proposes COOPER, a distributed multi-agent reinforcement learning method that jointly learns reputation assessment rules and reputation-based policies from environment rewards alone, without predefined assessment rules or intrinsic rewards. Its modules and the data flows among them are designed to overcome the delayed and noisy feedback caused by the entanglement between reputation and policy. In grid-world donation and coin games, COOPER adapts to various existing reputation systems and co-players, and reputation norms and cooperation co-emerge in self-play. These results hold across diverse social network topologies.

📝 Abstract

Reputation, the aggregation of peer assessments diffused through social networks, is a pivotal mechanism for promoting cooperation in the social dilemmas ubiquitous in distributed multi-agent systems, whose agents have limited perceptual and cognitive capabilities. Exploring efficient reputation systems, comprising reputation assessment rules and reputation-based policies, is a long-standing challenge. Previous work either assumes predefined reputation assessment rules or models reputation as an intrinsic reward for learning policies, which compromises these methods' ability to generalize and adapt. To address this, we propose a distributed multi-agent reinforcement learning method, $\textbf{COOPER}$ ($\textbf{COOP}$eration with $\textbf{E}$mergent $\textbf{R}$eputation), which jointly learns reputation assessment rules and reputation-based policies entirely from environment rewards. Notably, leveraging the underlying mechanisms of reputation, we deliberately design the constituent modules of $\textbf{COOPER}$ and the data flows among them, overcoming the latency and noise in the feedback signal caused by the deep entanglement between reputation and policy. Experiments on the donation game and the coin game in grid-world environments demonstrate that $\textbf{COOPER}$ effectively adapts to various existing reputation systems and co-players. Furthermore, we observe the co-emergence of reputation norms and cooperation in self-play settings. These results hold robustly across diverse social network topologies, underscoring the generalizability and efficacy of our approach.
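
For readers unfamiliar with the setting, the sketch below illustrates what a predefined reputation assessment rule looks like in a plain (non-grid-world) donation game, the kind of hand-written rule that the abstract says prior work assumes. It is not COOPER and does not reproduce the paper's environments. The payoff values `B` and `C`, the binary reputation encoding, the "image scoring" rule, and the function names are all illustrative assumptions.

```python
import random

# Hypothetical payoffs: helping costs the donor C and gives the recipient B (B > C).
B, C = 3.0, 1.0


def image_scoring(donor_action: str) -> int:
    """Predefined assessment rule (assumed for illustration):
    cooperating earns a good reputation (1), defecting a bad one (0)."""
    return 1 if donor_action == "C" else 0


def discriminator(recipient_reputation: int) -> str:
    """Reputation-based policy: help only recipients in good standing."""
    return "C" if recipient_reputation == 1 else "D"


def always_defect(recipient_reputation: int) -> str:
    """Baseline policy that never helps, whatever the recipient's reputation."""
    return "D"


def play(policies, rounds=1000, seed=0):
    """Run repeated donation games between randomly paired agents.

    Returns the accumulated payoff and final reputation of each agent.
    """
    rng = random.Random(seed)
    n = len(policies)
    reputation = [1] * n      # everyone starts in good standing
    payoff = [0.0] * n
    for _ in range(rounds):
        donor, recipient = rng.sample(range(n), 2)
        action = policies[donor](reputation[recipient])
        if action == "C":
            payoff[donor] -= C
            payoff[recipient] += B
        # The donor's public reputation is updated by the fixed rule.
        reputation[donor] = image_scoring(action)
    return payoff, reputation


if __name__ == "__main__":
    print(play([discriminator] * 4, rounds=100))
    print(play([discriminator] * 3 + [always_defect], rounds=100))
```

In this toy rule, a discriminator that refuses to help a defector is itself marked bad, which is one reason a fixed assessment rule may not transfer well across co-players. The abstract's proposal is to learn both the assessment rule and the policy from environment rewards instead of fixing either by hand.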

Problem

Research questions and friction points this paper is trying to address.

reputation

cooperation

multi-agent reinforcement learning

social dilemmas

distributed systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent reinforcement learning

emergent reputation

cooperation

reputation assessment

social dilemmas

🔎 Similar Papers

Reciprocal Reward Influence Encourages Cooperation From Self-Interested Agents

2024-06-03 · arXiv.org · Citations: 0