RocqSmith: Can Automatic Optimization Forge Better Proof Agents?

📅 2026-02-05
🤖 AI Summary
This study systematically evaluates whether automatic agent optimization methods can improve Rocq-based formal theorem-proving agents. Targeting the components that are usually tuned by hand (prompt design, contextual knowledge, and control strategies), we apply several automatic optimizers, including few-shot bootstrapping, to a Rocq proof-generation agent. To our knowledge, this is the first comprehensive empirical study of automatic agent optimization in a real-world formal proving environment. Several optimizers yield measurable improvements, and simple few-shot bootstrapping is the most consistently effective. However, none of the studied methods matches the performance of a carefully engineered state-of-the-art proof agent, which leaves a gap between automated tuning and expert manual refinement.

📝 Abstract
This work studies the applicability of automatic AI agent optimization methods to real-world agents in formal verification settings, focusing on automated theorem proving in Rocq as a representative and challenging domain. We evaluate how different automatic agent optimizers perform when applied to the task of optimizing a Rocq proof-generation agent, and assess whether parts of the fine-grained tuning of agentic systems, such as prompt design, contextual knowledge, and control strategies, can be automated. Our results show that while several optimizers yield measurable improvements, simple few-shot bootstrapping is the most consistently effective; however, none of the studied methods matches the performance of a carefully engineered state-of-the-art proof agent.
Problem

Research questions and friction points this paper is trying to address.

automatic optimization
proof agents
formal verification
automated theorem proving
Rocq
Innovation

Methods, ideas, or system contributions that make the work stand out.

automatic agent optimization
automated theorem proving
few-shot bootstrapping
formal verification
proof agents
Authors
Andrei Kozyrev
JetBrains Research Germany
Nikita Khramov
JetBrains Research Germany
Denis Lochmelis
JetBrains Research Germany
Valerio Morelli
JetBrains Research Germany
Gleb V. Solovev
JetBrains Research Germany
Anton Podkopaev
JetBrains Research, Constructor University Bremen
Programming languages, functional programming, verification, concurrency