AI Summary
To address the low proof success rate and high sample complexity of small language models (SLMs) in automated theorem proving, this paper proposes a multi-module agent framework that integrates an informal-reasoning large language model, a formal-proving SLM, and the Lean proof assistant. The framework uses closed-loop feedback to dynamically generate critical auxiliary lemmas that guide an efficient search over proof strategies. Its key innovation lies in using an SLM as the core reasoning engine for lemma-driven proving, which significantly reduces sampling requirements. Evaluated on the MiniF2F benchmark, the approach achieves an 86.1% proof success rate, the highest reported for SLM-based methods. Ablation studies and case analyses confirm that auxiliary lemma generation is the central mechanism behind both the search efficiency and the proof success.
Abstract
We present Prover Agent, a novel AI agent for automated theorem proving that integrates large language models (LLMs) with the Lean formal proof assistant. Prover Agent coordinates an informal-reasoning LLM, a formal prover model, and feedback from Lean, while also generating auxiliary lemmas that help discover the overall proof strategy. It achieves an 86.1% success rate on the MiniF2F benchmark, establishing a new state of the art among methods using small language models (SLMs), with a much lower sample budget than previous approaches. We also present case studies illustrating how the generated lemmas contribute to solving challenging problems.
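To make the role of auxiliary lemmas concrete, here is a toy Lean 4 illustration (constructed for this summary, not taken from the paper's case studies): once a small helper lemma is in place, the main goal closes in a single rewrite step.

```lean
-- Hypothetical auxiliary lemma, proved directly from the core library.
theorem aux (n : Nat) : n + 0 = n := Nat.add_zero n

-- With `aux` available, the main statement becomes a short rewrite.
theorem main (n : Nat) : (n + 0) + 0 = n := by
  rw [aux, aux]
```

In the agent's setting, the informal-reasoning model proposes lemmas like `aux`, the prover SLM discharges them, and the verified lemmas then shrink the remaining search for the main proof.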