🤖 AI Summary
This paper addresses multivariate polynomial functional decomposition, an NP-hard algebraic problem, with a Transformer-based symbolic reasoning framework. We build a synthetic data generation pipeline with fine-grained control over problem complexity, train Transformer models with supervised learning, and fine-tune them with Beam Grouped Relative Policy Optimization (BGRPO), a novel rank-aware reinforcement learning algorithm. Fine-tuning with BGRPO improves accuracy while cutting the beam width by up to half, which lowers inference compute by approximately 75%. The model is also competitive on polynomial simplification, outperforming Mathematica in various cases. The core contribution is BGRPO itself, which makes beam-search inference on hard algebraic problems cheaper without sacrificing accuracy.
📝 Abstract
Recent efforts have extended the capabilities of transformers in logical reasoning and symbolic computation. In this work, we investigate their capacity for non-linear latent pattern discovery in the context of functional decomposition, focusing on the challenging algebraic task of multivariate polynomial decomposition. This problem, which has widespread applications in science and engineering, has been proven NP-hard and demands both precision and insight. Our contributions are threefold. First, we develop a synthetic data generation pipeline that provides fine-grained control over problem complexity. Second, we train transformer models via supervised learning and evaluate them across four key dimensions involving scaling behavior and generalizability. Third, we propose Beam Grouped Relative Policy Optimization (BGRPO), a rank-aware reinforcement learning method suitable for hard algebraic problems. Fine-tuning with BGRPO improves accuracy while reducing beam width by up to half, resulting in approximately 75% lower inference compute. Additionally, our model demonstrates competitive performance in polynomial simplification, outperforming Mathematica in various cases.
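To make the task concrete, the sketch below illustrates what a decomposition of a multivariate polynomial looks like: an expanded polynomial f is rewritten as an outer polynomial g applied to inner polynomials h1 and h2. The polynomials and the helper `agrees` are hypothetical and chosen only for illustration; they are not taken from the paper. The check compares both sides at random integer points, which is a randomized identity test rather than a symbolic proof.

```python
import random

# Hypothetical example polynomials (illustrative, not from the paper).
def h1(x1, x2):
    return x1 + x2            # inner polynomial 1

def h2(x1, x2):
    return x1 * x2            # inner polynomial 2

def g(y1, y2):
    return y1**2 + 3 * y2     # outer polynomial

def f(x1, x2):
    # Expanded form of g(h1, h2): (x1 + x2)^2 + 3*x1*x2
    return x1**2 + 5 * x1 * x2 + x2**2

def agrees(f, g, inner, trials=50, seed=0):
    """Randomized check that f(x) == g(h1(x), h2(x), ...) on integer points."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = (rng.randint(-100, 100), rng.randint(-100, 100))
        if f(*x) != g(*(h(*x) for h in inner)):
            return False
    return True

print(agrees(f, g, [h1, h2]))   # candidate (g, [h1, h2]) reproduces f
print(agrees(f, g, [h1, h1]))   # a wrong candidate fails the check
```

Verifying a candidate in this way is cheap; the hard direction is finding g and the inner polynomials when only the expanded f is given.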