🤖 AI Summary
Existing interpretability methods primarily target centralized multi-agent reinforcement learning (MARL), rendering them inadequate for decentralized settings characterized by uncertainty and non-stationarity. This paper introduces the first systematic interpretability framework specifically designed for decentralized MARL, integrating policy decomposition with causal analysis to support three fundamental user queries: “When?”, “Why Not?”, and “What?”. The framework automatically generates policy summaries that capture task temporal structure and inter-agent collaboration dynamics. It is algorithm-agnostic and validated across four representative domains using two mainstream decentralized MARL algorithms—MADDPG and QMIX—demonstrating broad applicability and effectiveness. User studies show significant improvements in policy comprehension (+32%), query-answer accuracy (+28%), and user satisfaction (+41%). Overall, the framework establishes a scalable, interactive interpretability paradigm for decentralized multi-agent decision-making.
📝 Abstract
Multi-Agent Reinforcement Learning (MARL) has gained significant interest in recent years, enabling sequential decision-making across multiple agents in various domains. However, most existing explanation methods focus on centralized MARL, failing to address the uncertainty and nondeterminism inherent in decentralized settings. We propose methods to generate policy summarizations that capture task ordering and agent cooperation in decentralized MARL policies, along with query-based explanations that answer When, Why Not, and What questions about specific agent behaviors. We evaluate our approach across four MARL domains and two decentralized MARL algorithms, demonstrating its generalizability and computational efficiency. User studies show that our summarizations and explanations significantly improve users' question-answering performance and enhance their subjective ratings on metrics such as understanding and satisfaction.