DeKeyNLU: Enhancing Natural Language to SQL Generation through Task Decomposition and Keyword Extraction

📅 2025-09-17

🤖 AI Summary

In NL2SQL, large language models (LLMs) face two key bottlenecks: inaccurate task decomposition and imprecise extraction of domain-specific keywords, both of which lead to errors in SQL generation. Existing datasets tend to over-fragment tasks and lack explicit domain-specific keyword annotations, which limits how much fine-tuning on them can help. To address these issues, we propose DeKeyNLU, a dataset of 1,500 annotated QA pairs with explicit task decomposition and keyword labels, and DeKeySQL, a RAG-based pipeline comprising three modules: question understanding, entity retrieval, and SQL generation. Fine-tuning with DeKeyNLU raises execution accuracy on the BIRD dev set from 62.31% to 69.10% (+6.79 percentage points) and on the Spider dev set from 84.2% to 88.7% (+4.5 percentage points). The explicit decomposition and keyword labels also make question understanding in NL2SQL more interpretable.
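The reported gains are differences in execution accuracy: 69.10 - 62.31 = 6.79 points on BIRD and 88.7 - 84.2 = 4.5 points on Spider. As a rough illustration of what that metric measures, the sketch below runs a predicted query and a reference query against the same SQLite database and counts the pair as correct when both return the same set of rows. This is one common formulation, not the official BIRD or Spider evaluation script; the function names, the toy table, and the set-based comparison are all assumptions made for illustration.

```python
import sqlite3


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """Return True when both queries yield the same set of rows (assumed criterion)."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to execute counts as incorrect.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Set comparison ignores row order and duplicate rows.
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn: sqlite3.Connection, pairs: list) -> float:
    """Fraction of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    correct = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return correct / len(pairs)


# Toy database, purely illustrative.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
demo_conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("ann", 40), ("bob", 60), ("cy", 80)],
)

demo_pairs = [
    # Different SQL text, same result rows: counted as correct.
    ("SELECT name FROM employees WHERE salary > 50",
     "SELECT name FROM employees WHERE salary >= 60"),
    # Misspelled column: execution fails, counted as incorrect.
    ("SELECT nme FROM employees", "SELECT name FROM employees"),
]
demo_accuracy = execution_accuracy(demo_conn, demo_pairs)
```

Comparing result sets rather than SQL strings means two differently written queries can both count as correct, which is why the metric suits generated SQL.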

📝 Abstract

Natural Language to SQL (NL2SQL) provides a new model-centric paradigm that simplifies database access for non-technical users by converting natural language queries into SQL commands. Recent advancements, particularly those integrating Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) reasoning, have made significant strides in enhancing NL2SQL performance. However, inaccurate task decomposition and keyword extraction by LLMs remain major bottlenecks, often leading to errors in SQL generation. While existing datasets aim to mitigate these issues through fine-tuning, they over-fragment tasks and lack domain-specific keyword annotations, which limits their effectiveness. To address these limitations, we present DeKeyNLU, a novel dataset of 1,500 meticulously annotated QA pairs aimed at refining task decomposition and enhancing keyword extraction precision for the RAG pipeline. We also propose DeKeySQL, a RAG-based NL2SQL pipeline fine-tuned with DeKeyNLU that employs three distinct modules, for user question understanding, entity retrieval, and generation, to improve SQL generation accuracy. We benchmarked multiple model configurations within the DeKeySQL RAG pipeline. Experimental results demonstrate that fine-tuning with DeKeyNLU significantly improves SQL generation accuracy on both the BIRD (62.31% to 69.10%) and Spider (84.2% to 88.7%) dev datasets.
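The three-module structure described in the abstract can be pictured with a minimal sketch. Everything below is a hypothetical stand-in rather than the authors' implementation: the abstract does not specify interfaces, so the function names, the stopword-based keyword extractor (in place of a fine-tuned model), the fuzzy string matching (in place of the actual entity retrieval), and the prompt layout are all assumptions.

```python
from __future__ import annotations

import difflib
from dataclasses import dataclass

# Tiny stopword list used only by this toy keyword extractor (assumption).
STOPWORDS = {"what", "is", "the", "of", "in", "a", "an", "for", "to", "and"}


@dataclass
class Understanding:
    main_task: str
    sub_tasks: list[str]
    keywords: list[str]


def understand_question(question: str) -> Understanding:
    """Module 1 stand-in: a fine-tuned model would decompose the question
    and extract keywords; here we only drop stopwords and punctuation."""
    words = [w.strip("?,.!").lower() for w in question.split()]
    keywords = [w for w in words if w and w not in STOPWORDS]
    # No real decomposition here: the whole question is the single sub-task.
    return Understanding(main_task=question, sub_tasks=[question], keywords=keywords)


def retrieve_entities(keywords: list[str], schema: dict[str, list[str]],
                      threshold: float = 0.8) -> list[str]:
    """Module 2 stand-in: match keywords to column names by string similarity."""
    hits: list[str] = []
    for keyword in keywords:
        for table, columns in schema.items():
            for column in columns:
                score = difflib.SequenceMatcher(None, keyword, column).ratio()
                qualified = f"{table}.{column}"
                if score >= threshold and qualified not in hits:
                    hits.append(qualified)
    return hits


def build_generation_prompt(understanding: Understanding, entities: list[str]) -> str:
    """Module 3 stand-in: assemble the text an LLM would receive to write SQL."""
    return "\n".join([
        f"Question: {understanding.main_task}",
        "Sub-tasks: " + "; ".join(understanding.sub_tasks),
        "Keywords: " + ", ".join(understanding.keywords),
        "Relevant columns: " + ", ".join(entities),
        "Write one SQL query that answers the question.",
    ])


# Illustrative schema and question (both invented for this sketch).
demo_schema = {
    "employees": ["name", "salary", "department_id"],
    "departments": ["id", "name"],
}
demo_understanding = understand_question(
    "What is the average salary of employees in the sales department?"
)
demo_entities = retrieve_entities(demo_understanding.keywords, demo_schema)
demo_prompt = build_generation_prompt(demo_understanding, demo_entities)
```

The point of the separation is that errors can be traced to a stage: a missed keyword in the first step means the right column never reaches the generation prompt, which is the failure mode the dataset's keyword annotations target.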

Problem

Research questions and friction points this paper is trying to address.

Addresses inaccurate task decomposition in NL2SQL generation

Improves keyword extraction precision for SQL queries

Overcomes dataset limitations with domain-specific annotations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel dataset for task decomposition refinement

RAG-based pipeline with three distinct modules

Fine-tuning enhances SQL generation accuracy
