AI Summary
This work addresses the semantic ambiguity in natural language queries that arises when users lack explicit knowledge of how database values are stored, a limitation that traditional approaches handle with pre-built semantic layers that are often incomplete. The paper proposes GATE, a method that uses SQL execution feedback not merely for validation but to actively guide the completion of semantic mappings. By executing partial SQL queries while keeping multiple semantic hypotheses open, GATE uses execution outcomes to select and fix the correct mappings, iteratively building a reusable memory bank. By combining hypothesis retention, partial execution, feedback-driven matching, and memory storage, GATE enables execution-driven, self-bootstrapping learning of the semantic layer. Experiments on both real-world and controlled benchmarks show consistent improvements in Text-to-SQL accuracy over strong baselines.
Abstract
Real-world text-to-SQL is often under-specified until user phrases are grounded in how the database stores values. Prior work attempts to address this by requiring a semantic layer to specify groundings in advance, but such specifications are often incomplete, especially in expert domains where domain-specific conventions are under-documented. As this leaves multiple grounding hypotheses open for the same SQL part, we introduce GATE (Grounding After Test from Execution), which bootstraps missing groundings from execution feedback. GATE keeps grounding hypotheses open while executing the already grounded parts to obtain observations. Then, only the hypothesis supported by that observation is grounded and stored as a memory entry, recording what was tested and how the open part should be written in SQL. These entries accumulate into execution-grounded memory, allowing later steps to reuse supported groundings. Across real-world and controlled benchmarks, GATE consistently improves over strong baselines, demonstrating that execution can serve not only as validation but also as a bootstrapping mechanism for reusable memory in text-to-SQL.
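The loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `patients` table, the `ground_phrase` helper, the memory layout, and the rule "a hypothesis is supported if its literal appears among the observed column values" are all assumptions made for illustration, using Python's standard `sqlite3` module.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical in-memory table that stores sex as 'F'/'M'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, sex TEXT)")
    conn.executemany(
        "INSERT INTO patients (sex) VALUES (?)", [("F",), ("M",), ("F",)]
    )
    return conn


def ground_phrase(conn, phrase, table, column, candidates, memory):
    """Ground a user phrase to a SQL fragment using execution feedback.

    `candidates` are the open hypotheses for how the phrase is stored.
    Table and column names are trusted here because this is a toy example.
    """
    key = (phrase, table, column)
    if key in memory:
        # Reuse a grounding that an earlier execution already supported.
        return memory[key]["sql"]

    # Execute only the already grounded part (table and column) to observe
    # which values the database actually stores.
    probe = f"SELECT DISTINCT {column} FROM {table}"
    observed = {row[0] for row in conn.execute(probe)}

    # Keep only the hypotheses that the observation supports.
    supported = [value for value in candidates if value in observed]
    if len(supported) != 1:
        return None  # Still ambiguous or unsupported: leave the part open.

    fragment = f"{column} = '{supported[0]}'"
    # Record what was tested and how the open part should be written in SQL.
    memory[key] = {"tested": probe, "sql": fragment}
    return fragment


conn = build_demo_db()
memory = {}
fragment = ground_phrase(
    conn, "female", "patients", "sex", ["female", "Female", "F"], memory
)
count = conn.execute(
    f"SELECT COUNT(*) FROM patients WHERE {fragment}"
).fetchone()[0]
print(fragment, count)
```

In this sketch the phrase "female" stays ungrounded until the probe query shows that the column stores `F`, at which point the supported fragment is written to `memory` and later calls with the same phrase return it without re-executing the probe.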