Discovering Hidden Algebraic Structures via Transformers with Rank-Aware Beam GRPO

📅 2025-08-21

🤖 AI Summary

This paper addresses multivariate polynomial functional decomposition, an NP-hard algebraic problem, using a Transformer-based symbolic reasoning framework. To study non-linear latent pattern discovery, the authors build a synthetic data generation pipeline with fine-grained control over problem complexity, train models with supervised learning, and then fine-tune them with Beam Grouped Relative Policy Optimization (BGRPO), a rank-aware reinforcement learning method introduced in the paper. Fine-tuning with BGRPO improves accuracy while allowing the beam width to be cut by up to half, which reduces inference compute by approximately 75%. The model is also competitive on polynomial simplification, outperforming Mathematica in various cases. The core contribution is BGRPO itself, a reinforcement learning method suited to hard algebraic problems.

📝 Abstract

Recent efforts have extended the capabilities of transformers in logical reasoning and symbolic computation. In this work, we investigate their capacity for non-linear latent pattern discovery in the context of functional decomposition, focusing on the challenging algebraic task of multivariate polynomial decomposition. This problem, which has widespread applications in science and engineering, has been proven NP-hard and demands both precision and insight. Our contributions are threefold. First, we develop a synthetic data generation pipeline providing fine-grained control over problem complexity. Second, we train transformer models via supervised learning and evaluate them across four key dimensions involving scaling behavior and generalizability. Third, we propose Beam Grouped Relative Policy Optimization (BGRPO), a rank-aware reinforcement learning method suitable for hard algebraic problems. Fine-tuning with BGRPO improves accuracy while reducing beam width by up to half, resulting in approximately 75% lower inference compute. Additionally, our model demonstrates competitive performance in polynomial simplification, outperforming Mathematica in various cases.
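
To make the task concrete, here is a minimal sketch of what a decomposition instance might look like. It assumes the simplest form of the problem, a univariate outer polynomial g composed with a multivariate inner polynomial f; the exact problem form used in the paper is not given in this summary, and the helper names (`poly_add`, `poly_mul`, `compose`, `is_decomposition`) are hypothetical. The sketch only shows the easy direction: expanding g(f(x, y)) and checking a proposed answer. Recovering g and f from the expanded form is the hard part the model is trained for.

```python
from collections import defaultdict

# A polynomial is a dict mapping exponent tuples to integer coefficients,
# e.g. {(1, 1): 1, (0, 1): 1} is x*y + y in the variables (x, y).

def poly_add(p, q):
    """Add two polynomials, dropping terms whose coefficient becomes zero."""
    out = defaultdict(int)
    for poly in (p, q):
        for mono, coeff in poly.items():
            out[mono] += coeff
    return {m: c for m, c in out.items() if c != 0}

def poly_mul(p, q):
    """Multiply two polynomials term by term."""
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(a + b for a, b in zip(m1, m2))
            out[mono] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def compose(outer_coeffs, inner, nvars):
    """Expand g(f) where g(t) = sum(outer_coeffs[k] * t**k) and f = inner."""
    result = {}
    power = {(0,) * nvars: 1}  # f**0, the constant polynomial 1
    for coeff in outer_coeffs:
        if coeff != 0:
            scaled = {m: coeff * v for m, v in power.items()}
            result = poly_add(result, scaled)
        power = poly_mul(power, inner)
    return result

def is_decomposition(target, outer_coeffs, inner, nvars):
    """Check a candidate (g, f) pair by expanding it and comparing."""
    return compose(outer_coeffs, inner, nvars) == target

# Illustrative instance (an assumption, not taken from the paper):
# inner f(x, y) = x*y + y, outer g(t) = t**2 + 3*t.
inner = {(1, 1): 1, (0, 1): 1}
outer = [0, 3, 1]
expanded = compose(outer, inner, nvars=2)
```

Verifying a candidate costs one expansion, which is why a beam of candidates can be checked cheaply once the model has proposed them.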

Problem

Research questions and friction points this paper is trying to address.

Discovering hidden algebraic structures via transformers

Solving NP-hard multivariate polynomial decomposition

Improving accuracy with reduced computational inference cost
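
The abstract pairs a beam width reduction of up to half with roughly 75% lower inference compute. Those two figures are consistent if inference cost grows about quadratically with beam width. That scaling is an assumption made here for illustration, not something stated in this summary, and the function name `relative_compute` is hypothetical.

```python
def relative_compute(width_ratio, exponent=2.0):
    """Relative inference cost after scaling the beam width by width_ratio.

    Assumes cost is proportional to width**exponent; exponent=2.0 is an
    assumption chosen to match the figures quoted in the abstract.
    """
    return width_ratio ** exponent

# Halving the beam width under the quadratic assumption.
saving_quadratic = 1.0 - relative_compute(0.5)

# For comparison, a purely linear cost model would give a smaller saving.
saving_linear = 1.0 - relative_compute(0.5, exponent=1.0)
```

Under the quadratic assumption the saving is 0.75, matching the reported figure; under a linear model it would be only 0.5.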

Innovation

Methods, ideas, or system contributions that make the work stand out.

Beam Grouped Relative Policy Optimization method

Rank-aware reinforcement learning for algebra

Synthetic data generation pipeline with fine-grained control over problem complexity
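
The summary describes BGRPO only as a rank-aware, group-relative method applied to beam search outputs, so the following is a loose sketch of that general idea rather than the paper's algorithm. It assumes a hypothetical reward of 1 / (1 + rank) for correct beam candidates and 0 for incorrect ones, then standardizes rewards within the beam in the usual group-relative style. The function names and the reward shape are illustrative assumptions.

```python
from statistics import mean, pstdev

def rank_aware_rewards(is_correct):
    """Reward each beam candidate by correctness and beam rank.

    is_correct[i] is True when the candidate at rank i (0 = top of the
    beam) is a valid answer. The 1 / (1 + rank) shape is a hypothetical
    choice, not the reward defined in the paper.
    """
    return [1.0 / (1 + rank) if ok else 0.0
            for rank, ok in enumerate(is_correct)]

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group (here, one beam)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Illustrative beam of width 4: correct answers appear at ranks 1 and 3.
beam = [False, True, False, True]
rewards = rank_aware_rewards(beam)
advantages = group_relative_advantages(rewards)
```

With this reward shape, a correct answer found higher in the beam gets a larger advantage than one found lower, which pushes correct answers toward the top ranks and is one plausible route to keeping accuracy at a smaller beam width.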
