🤖 AI Summary
Current LLM-based code generation approaches struggle to ensure functional correctness and reliability because they lack systematic requirements modeling and formal verification mechanisms. This paper proposes ReDeFo, the first multi-agent code generation framework that integrates requirements engineering with formal methods. The approach establishes an end-to-end closed loop, “requirements understanding → formal modeling → code generation → logical verification,” and embeds formal specifications (e.g., temporal logic properties) into the multi-agent collaboration workflow. By tightly coupling LLM agents with formal reasoning and static analysis modules, ReDeFo aims to bridge the semantic gap between natural language requirements and code and to strengthen correctness assurance throughout the generation process. Experimental results demonstrate that ReDeFo significantly improves functional accuracy of generated code, effectively detects logical flaws, and verifies critical safety properties, advancing the state of trustworthy, automated code generation.
📝 Abstract
Automated code generation has long been considered the holy grail of software engineering. The emergence of Large Language Models (LLMs) has catalyzed a revolutionary breakthrough in this area. However, existing methods that rely solely on LLMs still produce code of inadequate quality and offer no guarantee that it satisfies practical requirements. They also lack a systematic strategy for requirements development and modeling. Recent LLM-based agents offer strong capabilities and play an essential role in aligning LLM outputs with user requirements. In this paper, we envision the first multi-agent framework for reliable code generation based on REquirements DEvelopment and FOrmalization, named ReDeFo. The framework incorporates three agents, each augmented with knowledge and techniques from formal methods, into the requirements-to-code generation pipeline to strengthen quality assurance. The core of ReDeFo is the use of formal specifications to bridge the gap between potentially ambiguous natural language requirements and precise executable code. ReDeFo enables rigorous reasoning about correctness, uncovers hidden bugs, and enforces critical properties throughout the development process. Overall, our framework aims to take a promising step toward realizing the long-standing vision of reliable, auto-generated software.
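To make the idea of a formal specification sitting between a natural language requirement and generated code more concrete, here is a minimal Python sketch. It is not ReDeFo's implementation: the names `Spec`, `find_counterexample`, `MAX_SPEC`, `candidate_buggy`, and `candidate_fixed` are hypothetical, the requirement ("return the larger of two integers") is an assumed toy example, and exhaustive checking over a small bounded domain stands in for the formal reasoning a real verifier would perform.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Spec:
    """A requirement formalized as a precondition/postcondition pair (hypothetical type)."""
    name: str
    pre: Callable[[int, int], bool]        # which inputs the requirement covers
    post: Callable[[int, int, int], bool]  # what must hold for (a, b, result)


def find_counterexample(
    impl: Callable[[int, int], int],
    spec: Spec,
    domain: Iterable[int],
) -> Optional[Tuple[int, int, int]]:
    """Check impl against spec on every input pair drawn from a bounded domain.

    Returns the first (a, b, result) that violates the postcondition, or None.
    This is bounded checking, not a proof: it says nothing about inputs
    outside the domain.
    """
    for a, b in product(domain, repeat=2):
        if not spec.pre(a, b):
            continue
        result = impl(a, b)
        if not spec.post(a, b, result):
            return (a, b, result)
    return None


# Assumed requirement: "return the larger of two integers".
# Formalized: the result is at least both inputs and is one of the inputs.
MAX_SPEC = Spec(
    name="max_of_two",
    pre=lambda a, b: True,
    post=lambda a, b, r: r >= a and r >= b and r in (a, b),
)


def candidate_buggy(a: int, b: int) -> int:
    # A plausible-looking generated candidate that returns the minimum instead.
    return a if a < b else b


def candidate_fixed(a: int, b: int) -> int:
    # A repaired candidate, e.g. produced after feeding back the counterexample.
    return a if a >= b else b


if __name__ == "__main__":
    small_ints = range(-2, 3)
    print(find_counterexample(candidate_buggy, MAX_SPEC, small_ints))  # (-2, -1, -2)
    print(find_counterexample(candidate_fixed, MAX_SPEC, small_ints))  # None
```

The counterexample returned for the buggy candidate is the kind of concrete feedback a verification step could hand back to a code-generating agent. A framework built on formal methods would presumably replace the bounded loop with a model checker or theorem prover, so that the result holds for all inputs rather than only those in the sampled domain.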