LMFPPO-UBP: Local Mean Field Proximal Policy Optimization with Unbalanced Punishment for Spatial Public Goods Games

📅 2026-02-20
🤖 AI Summary
This work addresses the challenge of sustaining cooperation in spatial public goods games, where high-dimensional state spaces and local externalities impede the stable emergence of cooperative behavior. The authors propose a deep reinforcement learning framework that combines local mean-field modeling with an asymmetric punishment mechanism. The mean field is reinterpreted as a social statistical sensor embedded in the policy gradient space, which lets agents perceive cooperation dynamics in their neighborhood. An asymmetric punishment, applied only to defectors and scaled in proportion to the local cooperator density, reshapes the payoff structure and lowers the barrier to cooperation. Implemented with the PPO algorithm, the approach reaches rapid and globally stable cooperation even under low enhancement factors, outperforms baselines such as Q-learning and Fermi update rules, and is reported to show strong coordination ability and robustness under statistical validation.
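As a rough illustration of the "social statistical sensor" idea, the sketch below computes a local mean field for one agent as the fraction of cooperators among its four nearest neighbors on a periodic square lattice. The lattice layout, the von Neumann neighborhood, the 1/0 strategy encoding, and the names `local_cooperation_density` and `observation` are assumptions made for illustration; the summary does not specify them.

```python
# Illustrative sketch only: lattice, neighborhood, and encoding are assumptions,
# not details taken from the paper.
COOPERATE, DEFECT = 1, 0


def local_cooperation_density(grid, row, col):
    """Fraction of cooperators among the four von Neumann neighbors.

    `grid` is a list of equal-length lists holding COOPERATE or DEFECT.
    Periodic (wrap-around) boundaries are assumed.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    neighbors = [
        grid[(row - 1) % n_rows][col],  # up
        grid[(row + 1) % n_rows][col],  # down
        grid[row][(col - 1) % n_cols],  # left
        grid[row][(col + 1) % n_cols],  # right
    ]
    return sum(neighbors) / len(neighbors)


def observation(grid, row, col):
    """Hypothetical policy input: own strategy plus the local mean field."""
    return (grid[row][col], local_cooperation_density(grid, row, col))


grid = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
print(observation(grid, 1, 1))
```

A policy network could take such a pair as input instead of the full lattice state, which is one plausible reading of how a neighborhood statistic keeps the observation small.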

📝 Abstract
Spatial public goods games are characterized by high-dimensional state spaces and localized externalities, which pose significant challenges for achieving stable and widespread cooperation. Traditional approaches often struggle to capture neighborhood-level strategic interactions and to dynamically align individual incentives with collective welfare. To address these challenges, this paper introduces a novel intelligent decision-making framework called Local Mean-Field Proximal Policy Optimization with Unbalanced Punishment (LMFPPO-UBP). The conventional mean-field concept is reformulated as a socio-statistical sensor embedded directly into the policy gradient space of deep reinforcement learning, allowing agents to adapt their strategies based on mesoscale neighborhood dynamics. Additionally, an unbalanced punishment mechanism is integrated to penalize defectors in proportion to the local density of cooperators, thereby reshaping the payoff structure without imposing direct costs on cooperative agents. Experimental results demonstrate that LMFPPO-UBP promotes rapid and stable global cooperation even under low enhancement factors, consistently outperforming baseline methods such as Q-learning and Fermi update rules. Statistical analyses further validate the framework's effectiveness in lowering the cooperation threshold and achieving better-coordinated outcomes.
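To make the incentive shift concrete, here is a minimal sketch of a single-group public goods payoff with a punishment that falls only on defectors and grows with the local cooperator density. The unit contribution cost, the single-group simplification (spatial models usually let each agent play in several overlapping groups), the linear penalty `beta * density`, and the name `punished_payoff` are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative sketch only: the payoff form and the penalty are assumptions.
def punished_payoff(is_cooperator, n_coop_neighbors, n_neighbors, r, beta):
    """Payoff of a focal agent in one public goods group.

    r    : enhancement factor multiplying the common pool
    beta : hypothetical punishment strength applied to defectors only
    """
    group_size = n_neighbors + 1  # neighbors plus the focal agent
    n_contributors = n_coop_neighbors + (1 if is_cooperator else 0)
    share = r * n_contributors / group_size  # equal split of the enhanced pool
    if is_cooperator:
        return share - 1.0  # cooperators pay a unit contribution, no penalty
    density = n_coop_neighbors / n_neighbors  # local cooperator density
    return share - beta * density  # defectors pay a density-scaled penalty


# Example with hypothetical numbers: r = 3, four neighbors, two of them cooperating.
print(punished_payoff(True, 2, 4, r=3.0, beta=1.0))   # cooperator
print(punished_payoff(False, 2, 4, r=3.0, beta=0.0))  # defector, no punishment
print(punished_payoff(False, 2, 4, r=3.0, beta=1.0))  # defector, punished
```

With these illustrative numbers the cooperator earns 0.8, an unpunished defector earns 1.2, and a punished defector earns 0.7, so defection stops being the better reply even though the enhancement factor is below the group size and cooperators bear no punishment cost.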
Problem

Research questions and friction points this paper is trying to address.

Spatial Public Goods Games
Cooperation
High-dimensional State Spaces
Localized Externalities
Collective Welfare
Innovation

Methods, ideas, or system contributions that make the work stand out.

Local Mean Field
Proximal Policy Optimization
Unbalanced Punishment
Spatial Public Goods Games
Deep Reinforcement Learning
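
The page names Proximal Policy Optimization without stating its objective. For reference, the standard PPO clipped surrogate for a single sample can be sketched as below. The default `epsilon=0.2` is a commonly used value rather than one taken from this paper, and the sketch does not show how LMFPPO-UBP feeds the local mean field into the policy.

```python
# Illustrative sketch of the standard PPO clipped surrogate for one sample.
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Return min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A).

    ratio     : new-policy probability divided by old-policy probability
    advantage : estimated advantage of the sampled action
    epsilon   : clipping range (0.2 is an assumed, commonly used default)
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


# A large ratio with a positive advantage is capped at (1 + epsilon) * advantage.
print(clipped_surrogate(1.5, 2.0))
```

The clipping removes any incentive to push the probability ratio beyond the range [1 - epsilon, 1 + epsilon] in a single update, which is a common reason for choosing PPO when many agents learn at the same time.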
Authors
Jinshuo Yang
State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, 550025, Guizhou, China; Institute of Cryptography and Data Security, Guizhou University, Guiyang, 550025, Guizhou, China
Zhaoqilin Yang
State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, 550025, Guizhou, China; Institute of Cryptography and Data Security, Guizhou University, Guiyang, 550025, Guizhou, China
Wenjie Zhou
State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, 550025, Guizhou, China; Institute of Cryptography and Data Security, Guizhou University, Guiyang, 550025, Guizhou, China
Xin Wang
China Agricultural University
Mechatronics, Automation, Sensors, Robotics
Youliang Tian
State Key Laboratory of Public Big Data, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, Guizhou, China; Institute of Cryptography and Data Security, Guizhou University, Guiyang, 550025, Guizhou, China