Problem
Research questions and friction points this paper is trying to address.
Lack of stability guarantees in reinforcement learning
Sample inefficiency in on-policy Lyapunov function learning
Need for data-efficient stability certificates in RL algorithms
Innovation
Methods, ideas, or system contributions that make the work stand out.
Off-policy Lyapunov function learning
Integration with SAC and PPO
Data-efficient stability certificates
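To make the idea of off-policy Lyapunov learning concrete, the following is a minimal sketch and not the paper's method. It assumes a hypothetical diagonal quadratic candidate V(s), a hypothetical decrease margin `alpha`, and a toy replay buffer of (s, s') transitions; all names are illustrative. It shows how a decrease condition V(s') - V(s) <= -alpha * V(s) could be scored on stored transitions rather than on fresh on-policy rollouts.

```python
import random

def lyapunov_value(state, weights):
    """Hypothetical candidate V(s) = sum_i w_i * s_i^2 (non-negative for w_i >= 0, zero at the origin)."""
    return sum(w * x * x for w, x in zip(weights, state))

def decrease_violation(batch, weights, alpha=0.1):
    """Mean hinge penalty for violating V(s') - V(s) <= -alpha * V(s) on a batch of (s, s') pairs."""
    total = 0.0
    for state, next_state in batch:
        v = lyapunov_value(state, weights)
        v_next = lyapunov_value(next_state, weights)
        total += max(0.0, v_next - v + alpha * v)
    return total / len(batch)

# Toy replay buffer: transitions from an assumed contracting system s' = 0.5 * s.
rng = random.Random(0)
replay_buffer = []
for _ in range(256):
    s = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    replay_buffer.append((s, [0.5 * x for x in s]))

# Off-policy use: sample stored transitions instead of collecting new rollouts.
batch = rng.sample(replay_buffer, 32)
stable_loss = decrease_violation(batch, weights=[1.0, 1.0])

# Same states under an assumed expanding system s' = 2 * s, which violates the condition.
unstable_batch = [(s, [2.0 * x for x in s]) for s, _ in batch]
unstable_loss = decrease_violation(unstable_batch, weights=[1.0, 1.0])
```

In this sketch the contracting transitions incur zero penalty and the expanding ones a positive penalty. In an actual SAC or PPO integration, a penalty of this kind would presumably be minimized over the parameters of a learned candidate function, which this example does not attempt.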