Hierarchical Support Vector State Partitioning for Distilling Black Box Reinforcement Learning Policies

📅 2026-05-05
📈 Citations: 0
Influential: 0

📝 Abstract
We introduce State Vector Space Partitioning (SVSP), a novel method for mimicking a black-box reinforcement learning policy with a set of human-interpretable subpolicies. By partitioning a distillation dataset of state-action pairs with linear support vector machine splits, SVSP constructs a compact and structured representation of the original policy. Our method improves mean return by 7.4% over previous critic-driven state partitioning approaches such as Voronoi State Partitioning (VSP) and by 2.8% over the original TD3 policy, while requiring 82.1% fewer subpolicies than VSP. Our results pave the way towards a more flexible form of distillation in which both the decision boundary and the surrogate models can be chosen within a margin of the original black-box behavior.
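The general idea of splitting a distillation dataset with linear SVM boundaries can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say how split labels are chosen or what form the subpolicies take, so the sketch assumes that samples are labeled by whether their action lies above the region's mean action, that each leaf subpolicy is a constant (the mean action), and that the SVM is trained by plain subgradient descent. All names (`black_box`, `train_linear_svm`, `build`, `predict`, `count_leaves`) are hypothetical.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    # Full-batch subgradient descent on the L2-regularised hinge loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # sample violates the margin
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, gw)]
        b -= lr * gb
    return w, b

def build(S, A, depth=3, tol=1e-9):
    # Recursively partition state-action pairs (S, A) into regions.
    mean_a = sum(A) / len(A)
    if depth == 0 or max(A) - min(A) <= tol:
        return {"action": mean_a}  # leaf: constant subpolicy (assumption)
    # Assumed labeling rule: is the action above the region's mean action?
    y = [1.0 if a > mean_a else -1.0 for a in A]
    w, b = train_linear_svm(S, y)
    side = [dot(w, s) + b > 0 for s in S]
    if all(side) or not any(side):
        return {"action": mean_a}  # split separated nothing; stop here
    pos = build([s for s, k in zip(S, side) if k],
                [a for a, k in zip(A, side) if k], depth - 1, tol)
    neg = build([s for s, k in zip(S, side) if not k],
                [a for a, k in zip(A, side) if not k], depth - 1, tol)
    return {"w": w, "b": b, "pos": pos, "neg": neg}

def predict(node, s):
    # Follow the linear decision boundaries down to a leaf subpolicy.
    while "action" not in node:
        node = node["pos"] if dot(node["w"], s) + node["b"] > 0 else node["neg"]
    return node["action"]

def count_leaves(node):
    if "action" in node:
        return 1
    return count_leaves(node["pos"]) + count_leaves(node["neg"])

def black_box(s):
    # Toy stand-in for the policy being distilled (hypothetical).
    if s[0] < 0:
        return -1.0
    return 1.0 if s[1] > 0 else 0.5

# Distillation dataset: states on a 10 x 10 grid, labeled by the black box.
grid = [(2 * i - 9) / 10 for i in range(10)]
states = [(x0, x1) for x0 in grid for x1 in grid]
actions = [black_box(s) for s in states]

tree = build(states, actions)
print("subpolicies:", count_leaves(tree))
```

Each internal node stores one hyperplane `(w, b)`, so the resulting partition can be read as a short sequence of linear tests on the state. Swapping the constant leaf for a linear or other simple surrogate would not change the partitioning logic.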
Problem

Research questions and friction points this paper is trying to address.

black box reinforcement learning
policy distillation
interpretable subpolicies
state partitioning
support vector machine
Innovation

Methods, ideas, or system contributions that make the work stand out.

State Vector Space Partitioning
Support Vector Machine
Policy Distillation
Interpretable Subpolicies
Reinforcement Learning