Lightweight Transformers for Zero-Shot and Fine-Tuned Text-to-SQL Generation Using Spider

📅 2025-08-06
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study addresses text-to-SQL translation under resource-constrained settings, systematically evaluating three lightweight Transformer models (T5-Small, BART-Small, and GPT-2) on the Spider benchmark under both zero-shot and fine-tuned paradigms. We propose a model-agnostic lightweight processing pipeline, comprising a unified database schema formatting method and a schema-aware generation strategy tailored for encoder-decoder architectures. Performance is comprehensively assessed using Logical Form Accuracy (LFAcc), BLEU, and Exact Match metrics. Experimental results show that fine-tuned T5-Small achieves 27.8% LFAcc, substantially outperforming BART-Small and GPT-2. The findings demonstrate the feasibility of deploying efficient, reusable NL2SQL systems using lightweight encoder-decoder models in low-resource environments. This work provides a practical technical pathway for NL2SQL deployment in education and business intelligence applications where computational efficiency and minimal infrastructure overhead are critical.
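The paper does not publish its code, so the following is only a hedged sketch of what the "unified database schema formatting" step might look like: flattening a question plus its database schema into a single input string for an encoder-decoder model such as T5-Small. The function name, the `translate to SQL:` prefix, and the `|` / `:` separators are all assumptions, not the paper's actual format.

```python
def serialize_schema(question: str, tables: dict[str, list[str]]) -> str:
    """Flatten a question and a database schema into one prompt string,
    so an encoder-decoder model sees the question and the available
    tables/columns in a single input sequence.

    Hypothetical format; the paper's exact serialization is not given.
    """
    parts = [f"translate to SQL: {question}"]
    for table, columns in tables.items():
        # One segment per table: "| table : col1, col2, ..."
        parts.append(f"| {table} : {', '.join(columns)}")
    return " ".join(parts)


prompt = serialize_schema(
    "How many singers are there?",
    {"singer": ["singer_id", "name", "country"]},
)
# prompt == "translate to SQL: How many singers are there? | singer : singer_id, name, country"
```

Because the serialization is independent of any one model, the same string can be fed to T5-style, BART-style, or decoder-only models, which is what makes the pipeline model-agnostic.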

πŸ“ Abstract
Text-to-SQL translation enables non-expert users to query relational databases using natural language, with applications in education and business intelligence. This study evaluates three lightweight transformer models (T5-Small, BART-Small, and GPT-2) on the Spider dataset, focusing on low-resource settings. We developed a reusable, model-agnostic pipeline that tailors schema formatting to each model's architecture, training them across 1000 to 5000 iterations and evaluating on 1000 test samples using Logical Form Accuracy (LFAcc), BLEU, and Exact Match (EM) metrics. Fine-tuned T5-Small achieves the highest LFAcc (27.8%), outperforming BART-Small (23.98%) and GPT-2 (20.1%), highlighting encoder-decoder models' superiority in schema-aware SQL generation. Despite resource constraints limiting performance, our pipeline's modularity supports future enhancements, such as advanced schema linking or alternative base models. This work underscores the potential of compact transformers for accessible text-to-SQL solutions in resource-scarce environments.
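Of the three metrics above, Exact Match is the simplest to illustrate. The paper's evaluation code is not included here, so this is a minimal sketch assuming only case and whitespace normalization; Spider's official EM is stricter and compares SQL component-by-component.

```python
import re


def normalize_sql(sql: str) -> str:
    """Lowercase, collapse runs of whitespace, and drop a trailing
    semicolon so superficial formatting differences are not penalized."""
    sql = sql.strip().rstrip(";").lower()
    return re.sub(r"\s+", " ", sql)


def exact_match(pred: str, gold: str) -> bool:
    """String-level EM on normalized SQL (a simplification of Spider's
    component-based exact match)."""
    return normalize_sql(pred) == normalize_sql(gold)


exact_match("SELECT COUNT(*)  FROM singer;", "select count(*) from singer")  # True
```

Logical Form Accuracy and BLEU relax this further, crediting predictions that are structurally or lexically close to the gold query even when the strings differ.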
Problem

Research questions and friction points this paper is trying to address.

Evaluating lightweight transformers for Text-to-SQL generation
Developing a reusable pipeline for schema-aware SQL translation
Assessing model performance in low-resource settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight transformers for text-to-SQL
Model-agnostic pipeline for schema formatting
Fine-tuned T5-Small achieves highest accuracy
Chirag Seth
University of Waterloo, Waterloo, ON, Canada
Utkarsh Singh
Unknown affiliation