Neuro-Symbolic Generation and Validation of Memory-Aware Formal Function Specifications

📅 2026-03-12
🤖 AI Summary
This work proposes a neuro-symbolic framework to enable scalable formal verification of C programs generated by large language models (LLMs). The approach automatically synthesizes memory-aware formal function specifications from natural language descriptions and function signatures, focusing on specification generation and deliberately leaving out the synthesis of complex loop invariants. It combines LLM in-context learning with symbolic reasoning, compiler diagnostics, and a formal verification toolchain, and introduces a machine-checkable refinement mechanism based on counterexamples and symbolic refutation to iteratively improve specifications. Experiments on the newly constructed LeetCode-C-Spec benchmark show that iterative refinement substantially improves the syntactic validity of generated specifications, while symbolic refutation significantly improves the accuracy of correctness judgments.
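The generate-check-refine loop described in this summary can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `refine`, `toy_generate`, and `toy_check` are hypothetical names, the generator stands in for an LLM call, and the checker stands in for the compiler and verification toolchain by testing only for balanced parentheses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RefinementResult:
    spec: Optional[str]      # candidate accepted by the checker, or None
    rounds: int              # number of candidates that were tried
    diagnostics: List[str]   # diagnostics fed back to the generator


def refine(
    generate: Callable[[str, List[str]], str],
    check: Callable[[str], Optional[str]],
    task: str,
    max_rounds: int = 3,
) -> RefinementResult:
    """Generate a candidate, check it, and feed diagnostics back until it passes."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        candidate = generate(task, feedback)   # hypothetical LLM call
        diagnostic = check(candidate)          # hypothetical toolchain call
        if diagnostic is None:
            return RefinementResult(candidate, round_no, feedback)
        feedback.append(diagnostic)
    return RefinementResult(None, max_rounds, feedback)


# Toy stand-ins so the sketch runs without an LLM or a verifier.
def toy_generate(task: str, feedback: List[str]) -> str:
    # The first attempt omits the closing parenthesis; with feedback it is fixed.
    return "requires valid(a, n)" if feedback else "requires valid(a, n"


def toy_check(spec: str) -> Optional[str]:
    # Stand-in for a syntax check: balanced parentheses only.
    if spec.count("(") == spec.count(")"):
        return None
    return "unbalanced parentheses"


result = refine(toy_generate, toy_check, "sum the first n elements of a")
print(result.spec, result.rounds, result.diagnostics)
```

Passing the generator and checker in as callables keeps the loop independent of any particular model or prover, which is a design choice of this sketch rather than something the summary states.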

📝 Abstract
Formal verification of memory-manipulating programs critically depends on precise, expert-written function specifications that capture memory states. This requirement has become a major bottleneck as large language models (LLMs) increasingly generate low-level systems code whose correctness cannot be assumed. To enable scalable formal verification, we focus exclusively on function specification generation, deliberately avoiding the synthesis of complex loop invariants that are central to traditional verification pipelines. We propose a neuro-symbolic framework for automatically generating memory-aware formal function specifications for C programs from natural language problem descriptions and function signatures. The pipeline first produces candidate specifications via in-context learning, and then iteratively refines them using compiler diagnostics from symbolic provers and the verification toolchain. In particular, we validate candidate specifications by constructing a proof for the negation of the specification with concrete examples, enabling machine-checked rejection of plausible-but-incorrect specifications. To support systematic evaluation, we introduce LeetCode-C-Spec, a new benchmark of 200 C programming problems for generating memory-aware formal function specifications. Experiments show that iterative refinement substantially improves syntactic validity, while symbolic prover-based refutation significantly enhances correctness assessment by filtering false positives that LLM-only judges frequently accept. Our results demonstrate that combining neural generation with symbolic feedback provides an effective approach to formal specification synthesis for memory-safe systems software.
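To illustrate how concrete examples can reject a plausible-but-incorrect specification, here is a small sketch. It is not the paper's mechanism: a specification is modeled as a pair of Python predicates over array contents before and after a call (an assumption of this sketch), plain evaluation stands in for the symbolic prover, and the names `Spec`, `Example`, and `refute` are hypothetical. The running task is reversing an array in place.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence


class Example(NamedTuple):
    before: List[int]   # array contents before the call
    after: List[int]    # expected array contents after the call


class Spec(NamedTuple):
    pre: Callable[[List[int]], bool]               # precondition on the input memory
    post: Callable[[List[int], List[int]], bool]   # relation between old and new memory


def refute(spec: Spec, examples: Sequence[Example]) -> Optional[Example]:
    """Return a known-correct example that the spec rejects, or None if none is found."""
    for ex in examples:
        if spec.pre(ex.before) and not spec.post(ex.before, ex.after):
            return ex
    return None


# Intended behavior: after the call, element i holds the old element n - 1 - i.
reverse_spec = Spec(
    pre=lambda old: True,
    post=lambda old, new: len(new) == len(old)
    and all(new[i] == old[len(old) - 1 - i] for i in range(len(old))),
)

# Plausible but wrong: claims the array ends up sorted in descending order.
sorted_spec = Spec(
    pre=lambda old: True,
    post=lambda old, new: new == sorted(old, reverse=True),
)

examples = [
    Example(before=[1, 2, 3], after=[3, 2, 1]),   # consistent with both candidates
    Example(before=[2, 1, 3], after=[3, 1, 2]),   # only consistent with reversal
]

print(refute(reverse_spec, examples))
print(refute(sorted_spec, examples))
```

The first example alone cannot separate the two candidates, which is why a judge that looks at a single case may accept the wrong one. Note that surviving every example is not evidence of correctness in this sketch: an overly weak specification would also survive.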
Problem

Research questions and friction points this paper is trying to address.

formal specification
memory-aware
function specification generation
formal verification
LLM-generated code
Innovation

Methods, ideas, or system contributions that make the work stand out.

neuro-symbolic
memory-aware specifications
formal verification
iterative refinement
specification synthesis
Liao Zhang
School of Computer Science, Shanghai Jiao Tong University
Tong Chen
Shanghai Innovation Institute
Xiwei Wu
Professor, City of Hope
Genomics, Bioinformatics, Cancer Biomarker, miRNA
Qi Liu
Shanghai Innovation Institute
Xiyu Zhai
Unknown affiliation
Xinqi Wang
School of Computer Science and Engineering, University of Washington
Qinxiang Cao
School of Artificial Intelligence, Shanghai Jiao Tong University