🤖 AI Summary
Traditional algebraic methods for automated polynomial inequality proving are constrained by degree truncation, limiting their efficacy on high-degree or sparse polynomials.
Method: This paper proposes a reinforcement learning (RL) framework that reformulates inequality proving as a linear programming problem, cast as a basis-selection task over the Krivine–Stengle nonnegative basis representation. It applies Proximal Policy Optimization (PPO) to efficiently search the basis space for feasible representations and incorporates fast Fourier transform (FFT)-accelerated multivariate polynomial multiplication to drastically speed up action evaluation.
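The FFT-accelerated multiplication step can be sketched generically (this is an illustrative reconstruction, not the paper's implementation): represent a multivariate polynomial as an n-dimensional coefficient array, so that multiplying two polynomials is an n-dimensional convolution, computable via NumPy's real FFT.

```python
import numpy as np

def poly_mul_fft(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Multiply two multivariate polynomials given as coefficient arrays.

    a[i, j, ...] holds the coefficient of x1^i * x2^j * ...; the product
    polynomial is the n-dimensional convolution of the arrays, computed
    in O(N log N) via the FFT instead of the naive O(N^2) convolution.
    """
    out_shape = tuple(sa + sb - 1 for sa, sb in zip(a.shape, b.shape))
    fa = np.fft.rfftn(a, out_shape)
    fb = np.fft.rfftn(b, out_shape)
    return np.fft.irfftn(fa * fb, out_shape)

# Example: p(x, y) = (1 + x)(1 + y), with the coefficient of x^i y^j at [i, j]
p = np.array([[1.0, 1.0], [1.0, 1.0]])
# Squaring p gives (1 + x)^2 (1 + y)^2
q = poly_mul_fft(p, p)
```

Here `q` rounds to the coefficient array of `(1 + 2x + x^2)(1 + 2y + y^2)`, i.e. `[[1, 2, 1], [2, 4, 2], [1, 2, 1]]`.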
Contribution/Results: The authors implement the open-source tool APPIRL, which outperforms state-of-the-art algebraic provers on standard benchmarks. Moreover, APPIRL successfully computes tight semidefinite relaxations for the maximum stable set problem, demonstrating both empirical effectiveness and generalization beyond synthetic benchmarks.
📄 Abstract
Polynomial inequality proving is fundamental to many mathematical disciplines and finds wide application in diverse fields. Traditional algebraic methods search for a positive-definite representation of the polynomial over a set of basis polynomials; however, they are limited by the truncation degree. To address this issue, this paper proposes a reinforcement learning approach to finding a Krivine-basis representation for proving polynomial inequalities. Specifically, we formulate the inequality proving problem as a linear programming (LP) problem and encode it as a basis selection problem using reinforcement learning (RL), attaining a nonnegative Krivine basis. Moreover, a fast multivariate polynomial multiplication method based on the Fast Fourier Transform (FFT) is employed to improve the efficiency of the action space search. Furthermore, we have implemented a tool called APPIRL (Automated Proof of Polynomial Inequalities via Reinforcement Learning). Experimental evaluation on benchmark problems demonstrates the feasibility and effectiveness of our approach. In addition, APPIRL has been successfully applied to solve the maximum stable set problem.
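As a toy illustration of the LP formulation (a minimal sketch under assumed notation, not APPIRL itself): to certify p(x) = x² − x + 1 > 0 on [0, 1], one searches for nonnegative coefficients λ expressing p over the degree-2 basis of products xᵃ(1−x)ᵇ built from the constraints g₁ = x ≥ 0 and g₂ = 1 − x ≥ 0. Matching monomial coefficients yields an equality-constrained feasibility LP, solvable here with `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog

# Columns: basis polynomials 1, x, 1-x, x^2, x(1-x), (1-x)^2.
# Rows: their coefficients on the monomials 1, x, x^2.
basis = np.array([
    [1, 0,  1, 0,  0,  1],   # constant terms
    [0, 1, -1, 0,  1, -2],   # x terms
    [0, 0,  0, 1, -1,  1],   # x^2 terms
], dtype=float)

# Target polynomial p(x) = x^2 - x + 1 as coefficients of (1, x, x^2).
p = np.array([1.0, -1.0, 1.0])

# Feasibility LP: find lambda >= 0 with basis @ lambda = p.
# Any feasible lambda is a certificate that p > 0 on [0, 1].
res = linprog(c=np.zeros(6), A_eq=basis, b_eq=p, bounds=(0, None))
```

A feasible solution exists (e.g. p = x + (1−x)²), so the LP succeeds. Deciding *which* basis products to include grows combinatorially with the degree and number of variables; that selection is the search space the paper's RL agent navigates.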