Efficient Neural Clause-Selection Reinforcement

📅 2025-03-10
🤖 AI Summary
Problem: Clause selection in saturation-based theorem proving relies heavily on hand-crafted heuristics, which can limit generalizability and efficiency. Method: The paper frames clause selection as a reinforcement learning (RL) task and lets RL principles guide the design of a clause-scoring neural network. The network is integrated into the Vampire theorem prover and trained from successful proof attempts on the TPTP benchmark; it is designed to be powerful yet cheap to evaluate, so that guidance adds little overhead during proof search. Contribution/Results: The learned scoring is meant to replace human-designed heuristics using prover experience alone. Under a short CPU instruction limit, the neurally guided prover solves 20% more problems unseen during training than the baseline strategy from which it initially learns.

📝 Abstract
Clause selection is arguably the most important choice point in saturation-based theorem proving. Framing it as a reinforcement learning (RL) task is a way to challenge the human-designed heuristics of state-of-the-art provers and to instead automatically evolve, just from prover experiences, their potentially optimal replacement. In this work, we present a neural network architecture for scoring clauses for clause selection that is powerful yet efficient to evaluate. Following RL principles to make design decisions, we integrate the network into the Vampire theorem prover and train it from successful proof attempts. An experiment on the diverse TPTP benchmark finds that the neurally guided prover improves over a baseline strategy, from which it initially learns, by 20% in terms of the number of in-training-unseen problems solved under a practically relevant, short CPU instruction limit.
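To make the role of clause selection concrete, here is a minimal sketch of a given-clause saturation loop in which the next clause is chosen by a pluggable scoring function. It is a toy propositional illustration, not Vampire's implementation or the paper's architecture: the names `resolvents`, `saturate`, and `score` are hypothetical, clauses are modelled as frozensets of integer literals, and `len` (prefer short clauses) stands in for the learned neural scorer.

```python
import heapq
from itertools import count
from typing import Callable, FrozenSet, Iterable, List

# Toy clause representation (an assumption for illustration):
# a clause is a frozenset of non-zero ints, where -n is the negation of n.
Clause = FrozenSet[int]


def resolvents(a: Clause, b: Clause) -> List[Clause]:
    """Return all binary resolvents of two propositional clauses."""
    out = []
    for lit in a:
        if -lit in b:
            out.append(frozenset((a - {lit}) | (b - {-lit})))
    return out


def saturate(axioms: Iterable[Clause],
             score: Callable[[Clause], float],
             max_steps: int = 1000) -> bool:
    """Given-clause loop: True if the empty clause (a refutation) is found.

    `score` decides which passive clause is selected next (lower is better).
    This is the choice point that a learned scorer would replace.
    """
    tie = count()  # unique tie-breaker so clauses themselves are never compared
    axioms = list(axioms)
    passive = [(score(c), next(tie), c) for c in axioms]
    heapq.heapify(passive)
    active: List[Clause] = []
    seen = set(axioms)
    for _ in range(max_steps):
        if not passive:
            return False  # saturated without deriving a contradiction
        _, _, given = heapq.heappop(passive)  # clause selection happens here
        if not given:
            return True  # empty clause selected: proof found
        active.append(given)
        for other in active:
            for r in resolvents(given, other):
                if r not in seen:
                    seen.add(r)
                    heapq.heappush(passive, (score(r), next(tie), r))
    return False  # step budget exhausted


# Usage: {p}, {not p or q}, {not q} is unsatisfiable, so a refutation exists.
problem = [frozenset({1}), frozenset({-1, 2}), frozenset({-2})]
found = saturate(problem, score=len)
```

Swapping `len` for another callable changes the order in which clauses are processed without touching the rest of the loop, which is why clause selection is a natural place to plug in a learned model.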

Problem

Research questions and friction points this paper is trying to address.

Automates clause selection in theorem proving using reinforcement learning.
Replaces human-designed heuristics with neural network-based scoring.
Improves on the baseline strategy by 20% in the number of unseen problems solved under a short CPU instruction limit.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural network for clause scoring
Reinforcement learning in theorem proving
Integration with Vampire theorem prover