🤖 AI Summary
This study addresses the scarcity of formal specifications in deductive verification of Java programs by proposing a closed-loop framework that couples LLM-driven, annotation-based JML specification generation with formal verification. Methodologically, it treats LLMs (e.g., CodeLlama, GPT) as unreliable oracles: task-oriented prompt engineering elicits candidate specifications, which are then subjected to automated provability checking and iterative refinement by an SMT-backed verifier (OpenJML/KeY). The key contribution is the first instantiation of an LLM–verifier collaborative closed-loop paradigm, tightly integrating specification synthesis with formal verification. On standard Java benchmark suites, 87% of LLM-generated specifications pass automatic verification, and 62% achieve full provable correctness, a substantial improvement in both the reliability and the practical applicability of generated specifications.
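The closed loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the oracle and verifier interfaces (`proposeSpec`, `check`) are hypothetical stand-ins for an LLM client and a tool such as OpenJML, and the feedback format is assumed.

```java
// Hypothetical sketch of the LLM-verifier closed loop: query an unreliable
// specification oracle, check each candidate with a verifier, and feed
// verifier errors back as refinement hints. Interface names are illustrative.
public final class SpecLoop {
    interface SpecOracle { String proposeSpec(String method, String feedback); }
    interface Verifier  { String check(String method, String spec); } // null = verified

    // Iterate up to maxRounds; return the first candidate that verifies,
    // or null if no provably correct specification is found.
    static String refine(SpecOracle llm, Verifier v, String method, int maxRounds) {
        String feedback = "";
        for (int i = 0; i < maxRounds; i++) {
            String spec = llm.proposeSpec(method, feedback);
            String error = v.check(method, spec);
            if (error == null) return spec;   // provably correct: accept
            feedback = error;                 // otherwise: refine with the error message
        }
        return null;                          // give up after maxRounds attempts
    }
}
```

In practice the verifier's counterexample or error output is what makes the refinement rounds productive; without feedback the oracle would just resample.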
📝 Abstract
Recent work has shown that Large Language Models (LLMs) are suitable not only for code generation but also for generating annotation-based code specifications. Scaling these methodologies may allow us to deduce provable correctness guarantees for large-scale software systems. In comparison to other LLM tasks, the application field of deductive verification has the notable advantage of providing a rigorous toolset for checking LLM-generated solutions. This short paper presents early results on how this rigorous toolset can be used to reliably elicit correct specification annotations from an unreliable LLM oracle.
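For readers unfamiliar with annotation-based specifications, the snippet below shows the kind of JML contract an LLM would be asked to produce. The example method is invented for illustration and is not drawn from the paper; the `//@ requires` / `//@ ensures` comments are standard JML clauses that a verifier such as OpenJML can attempt to prove.

```java
// A JML-annotated Java method: the specification lives in //@ comments,
// so the annotated code still compiles with an ordinary Java compiler.
public final class Abs {
    //@ requires x != Integer.MIN_VALUE;        // -MIN_VALUE overflows
    //@ ensures \result >= 0;                   // result is non-negative
    //@ ensures \result == x || \result == -x;  // result is x up to sign
    public static int abs(int x) {
        return x < 0 ? -x : x;
    }
}
```

The precondition on `Integer.MIN_VALUE` is exactly the sort of corner case that distinguishes a provable specification from a merely plausible one, and thus the kind of detail the verifier feedback loop is meant to catch.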