Play by the Type Rules: Inferring Constraints for LLM Functions in Declarative Programs

📅 2025-09-24
🤖 AI Summary
This paper addresses the dual challenges of type compliance and database consistency when integrating large language model (LLM) operators into declarative query languages. We propose a lightweight, end-to-end, type-aware execution framework. Methodologically, we (1) design a type constraint inference mechanism that automatically derives output type and value-domain constraints for LLM functions; (2) employ a small, domain-specialized language model as a verifiable function executor, enabling single-pass forward inference over heterogeneous data sources; and (3) eliminate reliance on traditional multi-turn LLM post-processing. Evaluated on multi-hop question answering, our approach improves accuracy by 7% and reduces inference latency by 53% compared to baseline methods. The implementation is publicly available.

📝 Abstract
Integrating LLM-powered operators in declarative query languages allows for the combination of cheap and interpretable functions with powerful, generalizable language model reasoning. However, in order to benefit from the optimized execution of a database query language like SQL, generated outputs must align with the rules enforced by both type checkers and database contents. Current approaches address this challenge with orchestrations consisting of many LLM-based post-processing calls to ensure alignment between generated outputs and database values, introducing performance bottlenecks. We perform a study on the ability of variously sized open-source language models to both parse and execute functions within a query language based on SQL, showing that small language models can excel as function executors over hybrid data sources. Then, we propose an efficient solution to enforce the well-typedness of LLM functions, demonstrating a 7% accuracy improvement on a multi-hop question answering dataset with a 53% improvement in latency over comparable solutions. We make our implementation available at https://github.com/parkervg/blendsql
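
The idea of inferring a constraint for an LLM function from the query around it can be illustrated with a small sketch. This is not the BlendSQL API or the paper's algorithm; `Constraint`, `infer_constraint`, and `check` are hypothetical names, and the three inference rules are assumptions chosen for illustration. The sketch shows how the surrounding expression (a bare predicate, a numeric comparison, or a comparison against a column) could determine the type and value domain that a generated output must satisfy before it is handed back to the SQL engine.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Constraint:
    """Hypothetical output constraint for one LLM function call."""
    type_name: str                      # "int", "bool", or "str"
    pattern: str                        # regex the raw generation must fully match
    domain: Optional[frozenset] = None  # allowed values, when known


def infer_constraint(operator, other_operand=None, column_values=None):
    """Infer a constraint from the expression surrounding the LLM call.

    The rules below are illustrative assumptions, not the paper's rules.
    """
    # Used as a bare predicate, e.g. WHERE llm(...): output must be boolean.
    if operator == "WHERE":
        return Constraint("bool", r"true|false")
    # Compared with an integer literal: output must be an integer.
    is_int = isinstance(other_operand, int) and not isinstance(other_operand, bool)
    if operator in {"<", ">", "<=", ">=", "="} and is_int:
        return Constraint("int", r"-?\d+")
    # Compared with a column: restrict output to values present in that column.
    if column_values is not None:
        return Constraint("str", r".+", frozenset(column_values))
    # No usable context: fall back to unconstrained text.
    return Constraint("str", r".+")


def check(raw, constraint):
    """Cast raw model output to the inferred type, or raise ValueError."""
    text = raw.strip()
    if re.fullmatch(constraint.pattern, text, flags=re.IGNORECASE) is None:
        raise ValueError(f"{text!r} is not a valid {constraint.type_name}")
    if constraint.type_name == "int":
        value = int(text)
    elif constraint.type_name == "bool":
        value = text.lower() == "true"
    else:
        value = text
    if constraint.domain is not None and value not in constraint.domain:
        raise ValueError(f"{value!r} is not among the known database values")
    return value
```

In this sketch, a call appearing as `llm(...) > 1000` would receive an integer constraint, so a generation such as "about 1200" is rejected rather than passed to the type checker. In a real system the same pattern could instead guide decoding so that invalid outputs are never produced; that choice is outside what this sketch covers.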

Problem

Research questions and friction points this paper is trying to address.

Enforcing type alignment between LLM outputs and database constraints
Reducing performance bottlenecks from LLM-based post-processing calls
Improving accuracy and latency for LLM functions in SQL queries

Innovation

Methods, ideas, or system contributions that make the work stand out.

Small language models execute SQL functions efficiently
Enforce well-typedness of LLM functions in queries
Improve accuracy and latency in question answering