🤖 AI Summary
In text-to-SQL tasks, lightweight models suffer from low accuracy on complex queries, while large reasoning models carry high inference overhead. This paper proposes an execution-result-guided multi-candidate SQL filtering framework, introducing an execution-feedback-driven candidate reranking paradigm. Using a lightweight semantic-consistency scoring mechanism, it reranks sampled SQL queries based on their actual execution results against the database. The approach requires no fine-tuning and adapts plug-and-play to any SQL generation model. It significantly improves the semantic correctness and execution accuracy of small models on complex queries, outperforming large reasoning models such as o1, o3-mini, and DeepSeek R1 on multiple standard benchmarks while reducing inference cost by up to 30×. To the authors' knowledge, this is the first approach to achieve simultaneous gains in both accuracy and efficiency for lightweight text-to-SQL models.
📝 Abstract
We propose a novel approach to generating complex structured outputs that significantly improves accuracy in text-to-SQL tasks. Our method leverages execution results to select the most semantically consistent query from multiple candidates, enabling smaller, cost-effective models to surpass computationally intensive reasoning methods such as o1, o3-mini, and DeepSeek R1 while cutting inference cost by as much as 30×. The approach integrates effortlessly with existing models, offering a practical, scalable path to state-of-the-art SQL generation.
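The abstract does not spell out the scoring mechanism, but one common way to realize "select the most semantically consistent query via execution results" is a majority vote over execution outcomes: run every sampled candidate against the database, group candidates by the result they produce, and keep a candidate from the largest group. The sketch below (function names and the SQLite backend are illustrative assumptions, not the paper's implementation) shows this idea:

```python
import sqlite3
from collections import Counter

def execute_sql(db_path: str, sql: str):
    """Run a candidate query; return a hashable, order-insensitive
    fingerprint of its result set, or None if execution fails."""
    try:
        con = sqlite3.connect(db_path)
        rows = con.execute(sql).fetchall()
        con.close()
        # Compare result sets as sorted tuples so row order doesn't matter.
        return tuple(sorted(map(str, rows)))
    except Exception:
        return None  # syntactically or semantically invalid candidate

def select_by_execution_consistency(db_path: str, candidates: list[str]) -> str:
    """Pick the candidate whose execution result is shared by the most
    other candidates (majority vote over execution results); candidates
    that fail to execute are filtered out."""
    results = {sql: execute_sql(db_path, sql) for sql in candidates}
    valid = [r for r in results.values() if r is not None]
    if not valid:
        return candidates[0]  # fallback: nothing executed successfully
    majority_result, _ = Counter(valid).most_common(1)[0]
    return next(sql for sql, r in results.items() if r == majority_result)
```

Note that this filter is model-agnostic, which is what makes the plug-and-play claim plausible: it only consumes sampled SQL strings and a database connection, regardless of which model produced the candidates.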