Worst-Case Symbolic Constraints Analysis and Generalisation with Large Language Models

📅 2025-06-09
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work investigates the capability of large language models (LLMs) to perform symbolic constraint analysis for worst-case program execution, aiming to bridge neural program modeling and formal symbolic reasoning. We formally define this novel task and propose a satisfiability modulo theories (SMT) solver-aligned reinforcement learning fine-tuning paradigm, integrating symbolic reasoning guidance with a custom-constructed constraint dataset to enable efficient fine-tuning of small-scale models (3B). The resulting WARP-1.0-3B model significantly outperforms both same-sized and larger baseline models across multiple symbolic constraint analysis benchmarks. Our results demonstrate that LLMs can not only participate in but also actively drive formal program analysis, establishing a new neuro-symbolic paradigm for program understanding grounded in rigorous constraint reasoning.

📝 Abstract
Large language models (LLMs) have been successfully applied to a variety of coding tasks, including code generation, completion, and repair. However, more complex symbolic reasoning tasks remain largely unexplored by LLMs. This paper investigates the capacity of LLMs to reason about worst-case executions in programs through symbolic constraints analysis, aiming to connect LLMs and symbolic reasoning approaches. Specifically, we define and address the problem of worst-case symbolic constraints analysis as a measure to assess the comprehension of LLMs. We evaluate the performance of existing LLMs on this novel task and further improve their capabilities through symbolic reasoning-guided fine-tuning, grounded in SMT (Satisfiability Modulo Theories) constraint solving and supported by a specially designed dataset of symbolic constraints. Experimental results show that our solver-aligned model, WARP-1.0-3B, consistently surpasses size-matched and even much larger baselines, demonstrating that a 3B LLM can recover the very constraints that pin down an algorithm's worst-case behaviour through reinforcement learning methods. These findings suggest that LLMs are capable of engaging in deeper symbolic reasoning, supporting a closer integration between neural network-based learning and formal methods for rigorous program analysis.
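To make the task concrete, here is a minimal sketch (not from the paper) of what a worst-case symbolic constraint looks like: for insertion sort measured by element shifts, the constraint `x0 > x1 > ... > x{n-1}` (a strictly decreasing input) uniquely forces the worst-case execution. The task studied above asks an LLM to recover such constraints symbolically; the brute-force enumeration below merely verifies the claim on a small instance.

```python
from itertools import permutations

def insertion_sort_shifts(xs):
    """Return the number of element shifts insertion sort performs on xs.

    Each shift moves one element a slot to the right, so the total equals
    the number of inversions in the input.
    """
    a, shifts = list(xs), 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]  # shift one slot right
            shifts += 1
            j -= 1
        a[j + 1] = key
    return shifts

n = 5
worst = max(permutations(range(n)), key=insertion_sort_shifts)
# The unique maximiser satisfies the symbolic constraint
# x0 > x1 > ... > x4, i.e. the input is strictly decreasing,
# and it performs n*(n-1)/2 shifts.
assert worst == (4, 3, 2, 1, 0)
assert insertion_sort_shifts(worst) == n * (n - 1) // 2
```

In the paper's setting, such constraints would be produced and checked symbolically (e.g. via an SMT solver) rather than by enumeration; the enumeration here only illustrates what the constraint pins down.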
Problem

Research questions and friction points this paper is trying to address.

Assessing LLMs' comprehension via worst-case symbolic constraints analysis
Connecting LLMs with symbolic reasoning for program execution analysis
Improving LLMs' symbolic reasoning via SMT-guided fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs analyze worst-case symbolic constraints
Fine-tuning with SMT constraint solving
Reinforcement learning recovers worst-case constraints