🤖 AI Summary
Large language models (LLMs) generate fluent yet unverifiable outputs, lacking formal guarantees of correctness or consistency.
Method: This paper proposes a neuro-symbolic framework inspired by Design by Contract (DbC), introducing semantic and type contracts that constrain LLM input-output behavior, augmented by a probabilistic repair mechanism that steers generation toward compliance. It posits a "functional equivalence hypothesis": any agents satisfying identical contracts are treated as contract-equivalent, unifying trust assessment across symbolic parsers and black-box components. Programmer-defined verification conditions, type theory, and DbC principles are jointly leveraged to impose end-to-end verifiable constraints on LLM invocations.
Contribution/Results: This work establishes the first end-to-end verifiable constraint framework for LLM calls. Experiments demonstrate significant improvements in output reliability, consistency, and verifiability, providing a practical, formally grounded design paradigm for trustworthy AI agents.
📝 Abstract
Generative models, particularly Large Language Models (LLMs), produce fluent outputs yet lack verifiable guarantees. We adapt Design by Contract (DbC) and type-theoretic principles to introduce a contract layer that mediates every LLM call. Contracts stipulate semantic and type requirements on inputs and outputs, coupled with probabilistic remediation to steer generation toward compliance. The layer exposes the dual view of LLMs as semantic parsers and probabilistic black-box components. Contract satisfaction is probabilistic, and semantic validation is operationally defined through programmer-specified conditions on well-typed data structures. More broadly, this work postulates that any two agents satisfying the same contracts are *functionally equivalent* with respect to those contracts.