🤖 AI Summary
This work addresses the challenge of inferring latent social preferences—particularly fairness inclinations and intrinsic reward structures—of agents in multi-agent systems, where explicit preference annotations are unavailable. We propose a Bayesian inverse reinforcement learning framework that models agents' concern for others' welfare, disentangling fairness components from their composite reward functions. Our approach introduces a grouped incentive mechanism that distinguishes procedural fairness, distributive fairness, and other dimensions, and jointly infers fairness preferences from equilibrium demonstrations in both normal-form and Markov games. Experiments in stochastic environments and a collaborative cooking task demonstrate that the method reliably identifies fairness preferences, significantly enhancing the interpretability and separability of social preferences. To our knowledge, this is the first verifiable inference paradigm for fairness alignment in multi-agent systems.
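To make the reward decomposition concrete, here is one standard way such a social-preference structure is written; the linear form and the weights $\alpha_{ij}$ are illustrative assumptions on our part, not notation taken from the paper:

$$R_i(s, a) \;=\; r_i(s, a) \;+\; \sum_{j \neq i} \alpha_{ij}\, r_j(s, a)$$

Here $r_i$ is agent $i$'s own material payoff and each coefficient $\alpha_{ij}$ encodes how much agent $i$ values agent $j$'s welfare (positive for altruism, negative for antagonism). Under this sketch, "disentangling fairness components" amounts to inferring the $\alpha_{ij}$ separately from the underlying payoffs $r_i$, given only demonstrated behaviour.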
📝 Abstract
From altruism to antagonism, fairness plays a central role in social interactions. But can we truly understand how fair someone is, especially without explicit knowledge of their preferences? We cast this challenge as a multi-agent inverse reinforcement learning problem, explicitly structuring rewards to reflect how agents value the welfare of others. We introduce novel Bayesian strategies, reasoning about the optimality of demonstrations and the characterisation of equilibria in general-sum Markov games. Our experiments, spanning randomised environments and a collaborative cooking task, reveal that coherent notions of fairness can be reliably inferred from demonstrations. Furthermore, when isolating fairness components, we obtain a disentangled understanding of agents' preferences. Crucially, we unveil that by placing agents in different groups, we can force them to exhibit new facets of their reward structures, cutting through ambiguity to answer the central question: who is being fair?
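As a sketch of what "reasoning about the optimality of demonstrations" typically looks like in Bayesian inverse reinforcement learning (the standard Boltzmann-rational likelihood of Ramachandran & Amir, 2007, offered as an illustrative assumption rather than the paper's exact formulation):

$$p(\theta \mid \mathcal{D}) \;\propto\; p(\theta) \prod_{(s, a) \in \mathcal{D}} \frac{\exp\!\big(\beta\, Q^{*}_{\theta}(s, a)\big)}{\sum_{a'} \exp\!\big(\beta\, Q^{*}_{\theta}(s, a')\big)}$$

where $\theta$ parameterises the structured rewards (including any fairness weights), $Q^{*}_{\theta}$ is the corresponding optimal action-value function, and $\beta$ controls how noisily-optimal the demonstrators are assumed to be. In a general-sum Markov game, $Q^{*}_{\theta}$ would instead be evaluated at an equilibrium rather than a single-agent optimum, which is where the characterisation of equilibria mentioned above enters.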