AI Summary
This study addresses a gap in existing institutional incentive mechanisms, which typically aim to minimize cost or maximize cooperation frequency while overlooking social welfare, defined as total benefits minus institutional expenditure. The authors propose a welfare-centric incentive framework for finite, well-mixed populations facing social dilemmas, covering both rewards and punishments within evolutionary game theory. They quantify the discrepancy between incentives tuned to conventional objectives and welfare-optimal incentives. By deriving explicit welfare expressions for the Donation Game and the Public Goods Game, they identify regimes in which welfare has a single optimal incentive level and regimes with phase transitions in which it becomes non-monotonic with multiple local optima, and they show that any non-zero optimal incentive concentrates near a simple closed-form target. The work further establishes closed-form conditions under which reward outperforms punishment and provides a computationally efficient algorithm for computing welfare-maximizing incentives.
Abstract
Institutional incentives are widely used to promote cooperation among autonomous, self-regarding agents, from human societies to multi-agent and AI systems. Existing work typically treats incentive design as a bi-objective problem: minimise institutional cost while achieving a high long-run frequency of cooperation. Whether such schemes also maximise social welfare (total population payoff net of institutional expenditure) has remained largely unexplored. We develop a welfare-centric framework for institutional incentives in finite, well-mixed populations playing a social dilemma (Donation Game and Public Goods Game), considering both rewards for cooperators and punishments for defectors. For each mechanism, we derive explicit expressions for expected social welfare and characterise how it depends on incentive efficiency and selection intensity. Analytically, we identify parameter regimes where social welfare has a single optimal incentive level and regimes with qualitative phase transitions, in which welfare becomes non-monotonic with multiple local optima. We prove that any welfare-maximising incentive is either zero or concentrated around a simple closed-form target, and we provide an efficient algorithm to compute these optima. Comparing reward and punishment, we further derive closed-form conditions under which reward outperforms punishment in terms of social welfare for any given budget. Overall, our results reveal a systematic gap between incentives optimised for cost or cooperation frequency and those that maximise welfare.
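To make the welfare definition concrete, the following minimal Python sketch computes social welfare (total population payoff net of institutional expenditure) for a single population state of the Donation Game. It is an illustration under stated assumptions, not the paper's derivation: the symbols `b`, `c`, `theta`, and `a`, the function names, and the convention that the institution spends `theta` per targeted agent to change that agent's payoff by `a * theta` are all hypothetical choices made here. The paper's expected welfare additionally averages over the evolutionary dynamics, which this sketch does not model.

```python
def donation_payoffs(k, N, b, c):
    """Average payoffs (pi_C, pi_D) in a well-mixed population of N agents,
    k of which cooperate. A cooperator pays c to give b to its co-player;
    agents do not interact with themselves. Requires N >= 2."""
    pi_c = b * (k - 1) / (N - 1) - c
    pi_d = b * k / (N - 1)
    return pi_c, pi_d


def welfare_with_reward(k, N, b, c, theta, a):
    """Welfare of a state with k cooperators when each cooperator is rewarded.
    Assumption: the institution spends theta per cooperator, which raises
    that cooperator's payoff by a * theta (a = incentive efficiency)."""
    pi_c, pi_d = donation_payoffs(k, N, b, c)
    total_payoff = k * (pi_c + a * theta) + (N - k) * pi_d
    expenditure = k * theta
    return total_payoff - expenditure


def welfare_with_punishment(k, N, b, c, theta, a):
    """Welfare of a state with k cooperators when each defector is punished.
    Assumption: the institution spends theta per defector, which lowers
    that defector's payoff by a * theta."""
    pi_c, pi_d = donation_payoffs(k, N, b, c)
    total_payoff = k * pi_c + (N - k) * (pi_d - a * theta)
    expenditure = (N - k) * theta
    return total_payoff - expenditure


if __name__ == "__main__":
    N, b, c, theta, a = 10, 2.0, 1.0, 1.0, 1.5
    for k in (0, 5, 10):
        print(k,
              welfare_with_reward(k, N, b, c, theta, a),
              welfare_with_punishment(k, N, b, c, theta, a))
```

Without incentives the total payoff in this sketch reduces to `k * (b - c)`. Under the assumed convention, rewarding adds `k * theta * (a - 1)` to welfare, while punishing subtracts `(N - k) * theta * (a + 1)`, so punishment can only raise welfare indirectly by shifting the population toward cooperation. This is one way to see why an incentive chosen for cooperation frequency need not be the one that maximises welfare.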