AmbiSQL: Interactive Ambiguity Detection and Resolution for Text-to-SQL

📅 2025-08-21
🤖 AI Summary
In text-to-SQL translation, natural language ambiguity frequently causes large language models (LLMs) to misinterpret user intent, which degrades SQL generation accuracy. To address this, we propose an interactive ambiguity detection and resolution framework. First, we introduce a fine-grained ambiguity taxonomy, the first to jointly model database element mapping and LLM inference behavior. Second, we design an LLM-based automated ambiguity identification algorithm and use multi-turn multiple-choice prompts to elicit user clarifications of ambiguous terms. Third, we dynamically rewrite the natural language query to incorporate the user's feedback, so the process can be embedded in existing text-to-SQL pipelines. Evaluated on a dedicated ambiguity benchmark, our method achieves 87.2% precision in ambiguity detection and improves end-to-end exact-match SQL accuracy by 50%, enhancing system robustness and practical interactivity.

📝 Abstract
Text-to-SQL systems translate natural language questions into SQL queries, providing substantial value for non-expert users. While large language models (LLMs) show promising results for this task, they remain error-prone. Query ambiguity has been recognized as a major obstacle for LLM-based Text-to-SQL systems, leading to misinterpretation of user intent and inaccurate SQL generation. We demonstrate AmbiSQL, an interactive system that automatically detects query ambiguities and guides users through intuitive multiple-choice questions to clarify their intent. Our approach introduces a fine-grained ambiguity taxonomy for identifying ambiguities that affect database element mapping and LLM reasoning, then incorporates user feedback to rewrite ambiguous questions. Evaluation on an ambiguous query dataset shows that AmbiSQL achieves 87.2% precision in ambiguity detection and improves SQL exact match accuracy by 50% when integrated with Text-to-SQL systems. Our demonstration showcases the significant performance gains and highlights the system's practical usability. Code repo and demonstration are available at: https://github.com/JustinzjDing/AmbiSQL.
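The detect, clarify, and rewrite loop described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the AmbiSQL repository. The `Ambiguity` class, the function names, the keyword lexicon, and the rewrite format are all hypothetical. In particular, the detector here is a plain keyword lookup standing in for the LLM-based detection the paper describes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Ambiguity:
    """Hypothetical record for one ambiguous term and its candidate readings."""
    term: str
    options: List[str]


def detect_ambiguities(question: str, lexicon: Dict[str, List[str]]) -> List[Ambiguity]:
    # Stand-in detector (assumption): substring lookup in a hand-written lexicon.
    # The paper describes LLM-based detection guided by an ambiguity taxonomy.
    lowered = question.lower()
    return [Ambiguity(term, opts) for term, opts in lexicon.items() if term in lowered]


def rewrite_question(question: str, clarifications: Dict[str, str]) -> str:
    # Append the chosen interpretations so a downstream Text-to-SQL system
    # receives a clarified question. The format is illustrative only.
    if not clarifications:
        return question
    notes = "; ".join(f"'{term}' means {choice}" for term, choice in clarifications.items())
    return f"{question} ({notes})"


def clarify(
    question: str,
    lexicon: Dict[str, List[str]],
    choose: Callable[[Ambiguity], int],
) -> str:
    # Ask one multiple-choice question per detected ambiguity; `choose`
    # returns the index of the option picked by the user.
    clarifications: Dict[str, str] = {}
    for ambiguity in detect_ambiguities(question, lexicon):
        clarifications[ambiguity.term] = ambiguity.options[choose(ambiguity)]
    return rewrite_question(question, clarifications)


if __name__ == "__main__":
    lexicon = {"top": ["highest total sales", "highest rating"]}
    # Simulate a user who always picks the first option.
    print(clarify("List the top products", lexicon, lambda ambiguity: 0))
```

In an interactive setting, `choose` would present `ambiguity.options` to the user and return the selected index. The rewritten question is then passed unchanged to whichever Text-to-SQL system is in use.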
Problem

Research questions and friction points this paper is trying to address.

Detecting query ambiguities in Text-to-SQL systems
Resolving ambiguity through interactive user clarification
Improving SQL generation accuracy via intent disambiguation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interactive ambiguity detection and resolution system
Fine-grained ambiguity taxonomy for database mapping
User feedback integration to rewrite ambiguous questions
Zhongjun Ding
Alibaba Group
Yin Lin
Alibaba Group
Tianjing Zeng
Alibaba Group
database