🤖 AI Summary
To address policy homogenization induced by parameter sharing in multi-agent reinforcement learning—particularly its inability to accommodate heterogeneous agent identities and task requirements—this paper proposes a zero-overhead, identity-driven adaptive subnet partitioning mechanism. Inspired by neural functional parcellation, the method employs a learnable identity encoder to generate agent-specific binary masks, which dynamically route inputs to localized subnetworks within a shared backbone, thereby enabling differentiated policy representations. Crucially, it introduces no additional parameters and is fully compatible with standard on-policy algorithms such as PPO and A2C. Extensive experiments on StarCraft II, the Multi-Agent Particle Environment (MPE), and custom heterogeneous benchmarks demonstrate an average 12.7% improvement in win rate and a 3.2× increase in inter-agent policy diversity, significantly outperforming both conventional parameter sharing and Hypernetwork-based baselines.
📝 Abstract
Parameter sharing, as an important technique in multi-agent systems, can effectively solve the scalability issue in large-scale agent problems. However, the effectiveness of parameter sharing largely depends on the environment setting. When agents have different identities or tasks, naive parameter sharing makes it difficult to generate sufficiently differentiated strategies for agents. Inspired by research pertaining to the brain in biology, we propose a novel parameter sharing method. It maps each type of agent to different regions within a shared network based on their identity, resulting in distinct subnetworks. Therefore, our method can increase the diversity of strategies among different agents without introducing additional training parameters. Through experiments conducted in multiple environments, our method has shown better performance than other parameter sharing methods.