Loop Invariant Generation: A Hybrid Framework of Reasoning optimised LLMs and SMT Solvers

📅 2025-08-01
🤖 AI Summary
This work addresses the automated synthesis of loop invariants for the formal verification of programs with loops. We propose a closed-loop generate-and-verify framework that tightly couples reasoning-optimized large language models (LLMs), specifically OpenAI O1, O1-mini, and O3-mini, with the Z3 SMT solver. Invariant synthesis proceeds by counterexample-guided iterative refinement: when Z3 rejects a candidate invariant, the counterexample it produces is fed back to the LLM to guide the next proposal. To our knowledge, this is the first approach to achieve this depth of integration between LLMs and SMT solvers in formal verification. On the Code2Inv benchmark (133 tasks), our method solves all 133 tasks, surpassing the prior state of the art (107/133), with an average of only 1–2 LLM proposals and a runtime of 14–55 seconds per task. The results show improved coverage and efficiency over prior methods and support the potential of LLMs for formal deductive reasoning.

📝 Abstract
Loop invariants are essential for proving the correctness of programs with loops. Developing loop invariants is challenging, and fully automatic synthesis cannot be guaranteed for arbitrary programs. Several approaches have been proposed to synthesize loop invariants, first with symbolic techniques and more recently with neural ones. These approaches correctly synthesize loop invariants only for subsets of standard benchmarks. In this work, we investigate whether modern, reasoning-optimized large language models can do better. We integrate OpenAI's O1, O1-mini, and O3-mini into a tightly coupled generate-and-check pipeline with the Z3 SMT solver, using solver counterexamples to iteratively guide invariant refinement. We use the Code2Inv benchmark, which provides C programs along with their formal preconditions and postconditions. On this benchmark of 133 tasks, our framework achieves 100% coverage (133 out of 133), outperforming the previous best of 107 out of 133, while requiring only 1-2 model proposals per instance and 14-55 seconds of wall-clock time. These results demonstrate that LLMs possess latent logical reasoning capabilities that can help automate loop invariant synthesis. While our experiments target C programs, the approach should generalize to other imperative languages.
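For reference, the check that a solver performs on a candidate invariant can be written as three verification conditions. This is the standard textbook formulation, not a transcription of the paper's encoding. The symbols are notation assumed here: $P$ is the precondition, $Q$ the postcondition, $b$ the loop guard, $T(s, s')$ the transition relation of one execution of the loop body, and $I$ the candidate invariant.

```latex
\begin{align*}
P(s) &\Rightarrow I(s) && \text{(initiation)} \\
I(s) \land b(s) \land T(s, s') &\Rightarrow I(s') && \text{(preservation)} \\
I(s) \land \lnot b(s) &\Rightarrow Q(s) && \text{(exit)}
\end{align*}
```

An SMT solver such as Z3 typically checks each implication by asking whether its negation is satisfiable. An unsatisfiable result means the condition holds; a satisfying assignment is a concrete counterexample state that can be returned to the model.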
Problem

Research questions and friction points this paper is trying to address.

Automating loop invariant synthesis for program correctness
Combining LLMs and SMT solvers for invariant refinement
Improving coverage and efficiency in invariant generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid framework combining LLMs and SMT solvers
Iterative refinement using solver counterexamples
Achieves 100% coverage on Code2Inv benchmark
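The generate-and-check loop with counterexample feedback can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not the authors' implementation: a bounded exhaustive search stands in for Z3, a fixed candidate list stands in for the LLM, and the toy program and all function names (`check`, `propose`, `synthesize`) are hypothetical.

```python
from itertools import product

# Toy task in the style of Code2Inv (hypothetical, not taken from the benchmark):
#   assume(x == 0 && n >= 0); while (x < n) x = x + 1; assert(x == n);
def pre(x, n): return x == 0 and n >= 0
def guard(x, n): return x < n
def body(x, n): return x + 1, n
def post(x, n): return x == n

def check(inv, bound=8):
    """Stand-in for the SMT solver: test the three invariant conditions on
    every state with -bound <= x, n <= bound. Returns None if all hold,
    otherwise (failed_condition, state) as a counterexample."""
    for x, n in product(range(-bound, bound + 1), repeat=2):
        if pre(x, n) and not inv(x, n):
            return ("initiation", (x, n))
        if inv(x, n) and guard(x, n) and not inv(*body(x, n)):
            return ("preservation", (x, n))
        if inv(x, n) and not guard(x, n) and not post(x, n):
            return ("exit", (x, n))
    return None

# Stand-in for the LLM: a fixed list of proposals, tried in order.
CANDIDATES = [
    ("x >= 0", lambda x, n: x >= 0),
    ("x <= n", lambda x, n: x <= n),
]

def propose(history):
    # A real system would place the failed candidates and their
    # counterexamples (history) in the model prompt; here we only
    # use the number of failures to pick the next fixed proposal.
    return CANDIDATES[len(history)]

def synthesize(max_rounds=len(CANDIDATES)):
    history = []  # (candidate_text, counterexample) for each rejected proposal
    for _ in range(max_rounds):
        text, inv = propose(history)
        cex = check(inv)
        if cex is None:
            return text, history
        history.append((text, cex))
    return None, history

invariant, history = synthesize()
print("accepted invariant:", invariant)
print("rejected proposals:", history)
```

Unlike an SMT solver, the bounded search only gives evidence on a finite domain, so an accepted candidate here is not a proof. The structure of the loop is the point: propose, check, record the counterexample, and propose again.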
Varun Bharti
IIIT Delhi, Delhi, India
Shashwat Jha
IIIT Delhi, Delhi, India
Dhruv Kumar
BITS Pilani, Pilani, India
Pankaj Jalote
IIIT-Delhi
Software Engineering