EntSQL: A Benchmark for Grounding Text-to-SQL in Long-Context Enterprise Knowledge

📅 2026-06-02
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing Text-to-SQL benchmarks largely overlook the private business knowledge (internal metrics, reporting standards, organizational rules) that SQL generation depends on in enterprise settings. The paper introduces EntSQL, described as the first enterprise-oriented Text-to-SQL benchmark built from authentic, lengthy corporate documents. It comprises 1,066 aligned English-Chinese examples spanning five business domains, most of which require models to integrate external private knowledge to generate complex SQL queries. Even with the full documentation provided, the best evaluated system reaches only 15.9% execution accuracy on English queries, which underscores how difficult grounding SQL generation in enterprise knowledge remains.
📝 Abstract
Text-to-SQL enables natural language access to databases, and recent LLMs have substantially advanced its capabilities. Existing benchmarks such as Spider, BIRD, and Spider 2.0 evaluate schema generalization, large-scale databases, and realistic workflows, but largely overlook enterprise scenarios where SQL generation depends on private business knowledge, such as internal metrics, reporting conventions, and organizational rules. We introduce EntSQL, an enterprise-oriented Text-to-SQL benchmark for evaluating long-context grounding over proprietary business documents. EntSQL contains 1,066 semantically aligned Chinese-English examples across five business domains; most examples require domain knowledge beyond the question and schema and involve complex SQL structures. On English inputs, the best evaluated system reaches only 15.9% when long-form documents are provided, highlighting the difficulty of grounding SQL generation in enterprise knowledge.
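
The summary above reports the headline figure as execution accuracy. The sketch below shows one common way such a score is computed in Text-to-SQL benchmarks: run the predicted and gold queries on the same database and count a hit when the result sets match. It is a minimal illustration under stated assumptions, not EntSQL's evaluation protocol, which this page does not describe. The SQLite backend, the `orders` table, and all function names are hypothetical.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Run a query and return its rows as a multiset, or None if it fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries run and return the same rows, ignoring order."""
    gold = run_query(conn, gold_sql)
    pred = run_query(conn, predicted_sql)
    return gold is not None and pred == gold


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return hits / len(pairs)


def build_demo_db():
    """Hypothetical toy database; not part of the benchmark."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "north", 100.0), (2, "south", 250.0), (3, "north", 50.0)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    pairs = [
        # Same result, different formatting: counted as a hit.
        ("SELECT SUM(amount) FROM orders WHERE region = 'north'",
         "SELECT SUM(amount) FROM orders WHERE region='north'"),
        # Runs, but answers a different question: counted as a miss.
        ("SELECT COUNT(*) FROM orders", "SELECT SUM(amount) FROM orders"),
        # Fails to execute: counted as a miss.
        ("SELECT nope FROM orders", "SELECT id FROM orders"),
    ]
    print(f"execution accuracy: {execution_accuracy(conn, pairs):.3f}")
```

Comparing rows as a multiset ignores row order, which suits queries without `ORDER BY`; an evaluation that cares about ordering would compare the row lists directly instead.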
Problem

Research questions and friction points this paper is trying to address.

Text-to-SQL
enterprise knowledge
long-context grounding
business semantics
proprietary documents
Innovation

Methods, ideas, or system contributions that make the work stand out.

Text-to-SQL
enterprise knowledge
long-context grounding
benchmark
domain-specific SQL