🤖 AI Summary
Automated proving of Olympiad inequalities faces two challenges: an exponentially large tactic space and scarce formalized training data. Method: This paper proposes a neuro-symbolic collaborative framework in which a large language model (LLM) performs intuitive algebraic rewriting, a symbolic engine handles decidable scaling and normalization transformations, and a joint neuro-symbolic scheduler dynamically prunes and ranks the goal space. Contribution/Results: The framework introduces a tactic allocation mechanism inspired by human problem-solving cognition, partitioning reasoning responsibilities between neural and symbolic components, without requiring additional fine-tuning. It achieves state-of-the-art performance on 161 international mathematical competition inequality problems, significantly outperforming both pure-LLM and pure-symbolic baselines.
📝 Abstract
Large language models (LLMs) can prove mathematical theorems formally by generating proof steps (a.k.a. tactics) within a proof system. However, the space of possible tactics is vast and complex, while the available training data for formal proofs is limited, posing a significant challenge to LLM-based tactic generation. To address this, we introduce a neuro-symbolic tactic generator that synergizes the mathematical intuition learned by LLMs with domain-specific insights encoded by symbolic methods. The key aspect of this integration is identifying which parts of mathematical reasoning are best suited to LLMs and which to symbolic methods. While the high-level idea of neuro-symbolic integration is broadly applicable to various mathematical problems, in this paper, we focus specifically on Olympiad inequalities (Figure 1). We analyze how humans solve these problems and distill the techniques into two types of tactics: (1) scaling, handled by symbolic methods, and (2) rewriting, handled by LLMs. In addition, we combine symbolic tools with LLMs to prune and rank the proof goals for efficient proof search. We evaluate our framework on 161 challenging inequalities from multiple mathematics competitions, achieving state-of-the-art performance and significantly outperforming existing LLM and symbolic approaches without requiring additional training data.
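The search procedure outlined in the abstract (two tactic families feeding a pruned, ranked best-first search over proof goals) can be illustrated with a toy sketch. Everything below is a hypothetical stand-in, not the paper's implementation: goals are plain strings, the hard-coded rewrites in `rewriting_tactics` mimic what an LLM would propose, `scaling_tactics` is an empty placeholder for the symbolic engine, and `rank` is a trivial simplicity heuristic.

```python
import heapq

def rewriting_tactics(goal: str):
    # Neural component stub: in the framework an LLM proposes algebraic
    # rewrites; here two hard-coded rewrites stand in for it.
    if goal == "a^2 + b^2 >= 2*a*b":
        yield "a^2 - 2*a*b + b^2 >= 0"   # move every term to one side
    if goal == "a^2 - 2*a*b + b^2 >= 0":
        yield "(a - b)^2 >= 0"           # factor as a perfect square

def scaling_tactics(goal: str):
    # Symbolic component stub: decidable scaling steps (e.g. AM-GM)
    # would go here; the toy example needs none, so it yields nothing.
    return iter(())

def is_closed(goal: str) -> bool:
    # Stand-in for a symbolic decision procedure: goals of the
    # form (expr)^2 >= 0 are closed, since squares are nonnegative.
    return goal.startswith("(") and goal.endswith("^2 >= 0")

def rank(goal: str) -> int:
    # Joint ranking heuristic: prefer shorter (simpler) goals.
    return len(goal)

def prove(start: str, budget: int = 50):
    # Best-first search over the goal space; the priority queue plays
    # the role of the pruning-and-ranking scheduler. Returns the chain
    # of goals from the original inequality to a closed goal, or None.
    frontier = [(rank(start), start, [start])]
    seen = {start}
    while frontier and budget > 0:
        budget -= 1
        _, goal, path = heapq.heappop(frontier)
        if is_closed(goal):
            return path
        for nxt in list(rewriting_tactics(goal)) + list(scaling_tactics(goal)):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (rank(nxt), nxt, path + [nxt]))
    return None

proof = prove("a^2 + b^2 >= 2*a*b")
```

Running the sketch proves the AM-GM instance `a^2 + b^2 >= 2*a*b` in two tactic steps, ending at the closed goal `(a - b)^2 >= 0`. The division of labor mirrors the abstract: rewriting (LLM) transforms goals, scaling (symbolic) would generate sufficient stronger goals, and the ranked frontier keeps the search tractable.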