🤖 AI Summary
In text-to-SQL, self-play fine-tuning methods (e.g., SPIN) suffer from limited improvement of the main model due to insufficient information gain and excessive generation of correct SQL by the opponent model. To address this, we propose SPFT-SQL, a novel self-play framework featuring: (1) verification-driven iterative data construction—leveraging SQL execution feedback to select high-quality synthetic examples; and (2) an error-directed loss mechanism—explicitly encouraging the opponent model to generate discriminative erroneous SQL, thereby enhancing the main model’s ability to detect and correct semantic-structural mismatches. Extensive experiments across six open-source large language models and five mainstream benchmarks demonstrate that SPFT-SQL consistently outperforms existing state-of-the-art methods, achieving average accuracy gains of 3.2–7.8 percentage points.
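The verification-driven selection step described above can be sketched with standard execution feedback: run each synthetic SQL against the database and keep only examples whose result set matches the gold query's. This is a minimal illustration using `sqlite3`; the function names and the triple layout `(question, candidate_sql, gold_sql)` are our own assumptions, not the paper's API.

```python
import sqlite3

def execution_matches(db_path, candidate_sql, gold_sql):
    """Return True if candidate_sql yields the same result set as gold_sql.

    Results are compared as multisets (order-insensitive), a common
    execution-feedback criterion in Text-to-SQL evaluation.
    """
    conn = sqlite3.connect(db_path)
    try:
        cand = conn.execute(candidate_sql).fetchall()
        gold = conn.execute(gold_sql).fetchall()
    except sqlite3.Error:
        return False  # candidates that fail to execute are discarded
    finally:
        conn.close()
    return sorted(map(repr, cand)) == sorted(map(repr, gold))

def select_verified_examples(db_path, synthetic_triples):
    """Keep only (question, candidate_sql, gold_sql) triples that pass execution feedback."""
    return [t for t in synthetic_triples
            if execution_matches(db_path, t[1], t[2])]
```

In an iterative loop, the surviving examples would be fed back into fine-tuning and the process repeated, yielding checkpoints of varying capability for the later self-play phase.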
📝 Abstract
Self-play fine-tuning (SPIN) can transform a weak large language model (LLM) into a strong one through competitive interactions between models of varying capabilities, yet it still struggles on the Text-to-SQL task: self-play generates no new information, and the large number of correct SQL queries produced by the opponent model weakens the main model's ability to generate accurate SQL. To address these challenges, we propose SPFT-SQL, a self-play fine-tuning method tailored to Text-to-SQL. Before self-play, we introduce a verification-based iterative fine-tuning stage that synthesizes high-quality fine-tuning data from the database schema and validation feedback, improving model performance while building a model base of varying capabilities. During the self-play fine-tuning phase, we propose an error-driven loss that incentivizes incorrect outputs from the opponent model, enabling the main model to distinguish correct SQL from the opponent's erroneous SQL and thereby improving its ability to generate correct SQL. Extensive experiments and in-depth analyses on six open-source LLMs and five widely used benchmarks demonstrate that our approach outperforms existing state-of-the-art (SOTA) methods.
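To make the loss discussion concrete, here is a minimal numeric sketch of a SPIN/DPO-style logistic loss over sequence log-probabilities, plus one hypothetical "error-driven" variant: if the opponent's SQL happens to be correct, its pair carries no discriminative signal and is down-weighted. This is our reading of the idea stated in the abstract, not the paper's actual objective; all names and the `penalty` parameter are assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def spin_style_loss(lp_main_gold, lp_ref_gold, lp_main_opp, lp_ref_opp, beta=0.1):
    """SPIN/DPO-style loss on sequence log-probs: push the main model toward
    the gold SQL and away from the opponent model's SQL, relative to a
    frozen reference model."""
    margin = beta * ((lp_main_gold - lp_ref_gold) - (lp_main_opp - lp_ref_opp))
    return -math.log(sigmoid(margin))

def error_driven_loss(lp_main_gold, lp_ref_gold, lp_main_opp, lp_ref_opp,
                      opp_sql_is_correct, beta=0.1, penalty=1.0):
    """Hypothetical variant: pairs where the opponent's SQL executes to the
    gold result contribute nothing, so only erroneous opponent outputs
    drive the main model's update."""
    if opp_sql_is_correct:
        return 0.0
    return penalty * spin_style_loss(lp_main_gold, lp_ref_gold,
                                     lp_main_opp, lp_ref_opp, beta)
```

With equal log-probabilities everywhere the margin is zero and the loss is log 2; as the main model assigns more mass to the gold SQL than the opponent's, the loss decreases.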