🤖 AI Summary
Reinforcement learning (RL)-based network load balancing suffers from poor interpretability (“black-box” policies) and the inability to extract explicit controller equations, hindering verification and deployment.
Method: This paper proposes the first integration of Kolmogorov–Arnold Networks (KANs) into the Proximal Policy Optimization (PPO) framework, designing a structured, interpretable Actor network (a single-layer KAN) alongside a standard MLP-based Critic. A multi-objective reward function jointly optimizes throughput utility, packet loss rate, and end-to-end latency.
Contribution/Results: The approach significantly improves throughput, reduces packet loss, and lowers latency across diverse network settings. Crucially, the KAN-based Actor enables direct analytical extraction of compact, physically meaningful, explicit load-balancing equations from the trained policy—overcoming the fundamental limitations of uninterpretable RL controllers and non-extractable policies. This establishes a novel paradigm for verifiable, deployable intelligent network control.
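The equation-extraction property described above follows from the KAN actor's structure: every input-to-output edge carries an explicit univariate function, so the trained policy can be written out symbolically. The following is a minimal, hypothetical sketch of that idea using a small fixed function basis (the paper's actual KAN parameterization, basis, and dimensions are not specified here and are assumptions):

```python
import numpy as np

# Hypothetical single-layer KAN actor: each edge (input i -> action j)
# carries a univariate function phi_ij, parameterized here as a linear
# combination of a small fixed basis. Because each edge is an explicit
# formula, the whole policy can be read back as closed-form equations.

BASIS = [
    ("x",      lambda x: x),
    ("x^2",    lambda x: x ** 2),
    ("sin(x)", np.sin),
]

class KANActor:
    def __init__(self, n_inputs, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        # coeffs[i, j, k]: weight of basis function k on edge i -> j
        self.coeffs = 0.1 * rng.standard_normal((n_inputs, n_actions, len(BASIS)))

    def scores(self, state):
        # score_j = sum_i phi_ij(state_i), with phi_ij a basis expansion
        feats = np.stack([f(state) for _, f in BASIS], axis=-1)  # (n_inputs, K)
        return np.einsum("ik,ijk->j", feats, self.coeffs)

    def policy(self, state):
        # Softmax over candidate servers/paths to get routing probabilities
        s = self.scores(state)
        e = np.exp(s - s.max())
        return e / e.sum()

    def extract_equations(self):
        # Read the (trained) policy back as explicit controller equations.
        eqs = []
        n_inputs, n_actions, _ = self.coeffs.shape
        for j in range(n_actions):
            terms = []
            for i in range(n_inputs):
                for k, (name, _) in enumerate(BASIS):
                    c = self.coeffs[i, j, k]
                    terms.append(f"{c:+.3f}*{name.replace('x', f'x{i}')}")
            eqs.append(f"score_{j} = " + " ".join(terms))
        return eqs
```

In a full PPO loop the `coeffs` tensor would be trained by gradient ascent on the clipped surrogate objective alongside an MLP critic; after training, `extract_equations()` yields the compact load-balancing equations the paper highlights.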
📝 Abstract
Reinforcement learning (RL) has been increasingly applied to network control problems such as load balancing. However, existing RL approaches often suffer from a lack of interpretability and the difficulty of extracting controller equations. In this paper, we propose the use of Kolmogorov-Arnold Networks (KANs) for interpretable RL in network control. We employ a PPO agent with a single-layer KAN actor and an MLP critic network to learn load balancing policies that maximize throughput utility while minimizing packet loss and delay. Our approach allows us to extract controller equations from the learned networks, providing insight into the decision-making process. We evaluate our approach using different reward functions, demonstrating its effectiveness in improving network performance while providing interpretable policies.
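The multi-objective reward described in the abstract could take the following shape. The log-utility form and the weights `alpha`, `beta`, `gamma` are illustrative assumptions; the paper only states that throughput utility is maximized while loss and delay are minimized:

```python
import math

def reward(throughput_mbps, loss_rate, delay_ms,
           alpha=1.0, beta=2.0, gamma=0.01):
    """Hypothetical multi-objective reward: reward throughput utility,
    penalize packet loss rate (0..1) and end-to-end delay (ms)."""
    utility = math.log(1.0 + throughput_mbps)  # concave: diminishing returns
    return alpha * utility - beta * loss_rate - gamma * delay_ms
```

A concave throughput utility is a common choice in network resource allocation because it encourages fair sharing rather than maximizing a single flow; the relative weights would be tuned per deployment.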