Lightweight Transformers for Zero-Shot and Fine-Tuned Text-to-SQL Generation Using Spider

📅 2025-08-06

🤖 AI Summary

This study addresses text-to-SQL translation in resource-constrained settings, evaluating three lightweight Transformer models (T5-Small, BART-Small, and GPT-2) on the Spider benchmark under both zero-shot and fine-tuned paradigms. It proposes a model-agnostic lightweight processing pipeline comprising a unified database schema formatting method and a schema-aware generation strategy tailored for encoder-decoder architectures. Performance is assessed using Logical Form Accuracy (LFAcc), BLEU, and Exact Match (EM). Fine-tuned T5-Small achieves 27.8% LFAcc, ahead of BART-Small (23.98%) and GPT-2 (20.1%). The findings suggest that efficient, reusable NL2SQL systems built on lightweight encoder-decoder models are feasible in low-resource environments, although absolute accuracy remains limited. The work offers a practical pathway for NL2SQL deployment in education and business intelligence applications where computational efficiency and minimal infrastructure overhead are critical.
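To make the idea of schema formatting concrete, the following is a minimal sketch of how one schema might be serialized once and then wrapped differently for an encoder-decoder model and a decoder-only model. The serialization format, the prompt wording, and every function name here (`serialize_schema`, `build_seq2seq_input`, `build_causal_prompt`) are illustrative assumptions; the summary does not specify the format the paper actually uses.

```python
from typing import Dict, List


def serialize_schema(tables: Dict[str, List[str]]) -> str:
    """Flatten a schema to one line, e.g. 'singer(id, name) | concert(id)'.

    Assumed format for illustration only; not taken from the paper.
    """
    return " | ".join(
        f"{name}({', '.join(columns)})" for name, columns in tables.items()
    )


def build_seq2seq_input(question: str, tables: Dict[str, List[str]]) -> str:
    """Source text for an encoder-decoder model (T5- or BART-style)."""
    return f"translate to SQL: {question} | schema: {serialize_schema(tables)}"


def build_causal_prompt(question: str, tables: Dict[str, List[str]]) -> str:
    """Prompt for a decoder-only model (GPT-2-style); the model continues after 'SQL:'."""
    return (
        f"-- schema: {serialize_schema(tables)}\n"
        f"-- question: {question}\n"
        "SQL:"
    )


if __name__ == "__main__":
    schema = {
        "singer": ["singer_id", "name"],
        "concert": ["concert_id", "singer_id"],
    }
    print(build_seq2seq_input("How many singers are there?", schema))
    print(build_causal_prompt("How many singers are there?", schema))
```

Keeping the schema serializer separate from the per-architecture wrappers is one way to keep such a pipeline model-agnostic: swapping the base model only requires a new wrapper.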

📝 Abstract

Text-to-SQL translation enables non-expert users to query relational databases using natural language, with applications in education and business intelligence. This study evaluates three lightweight transformer models (T5-Small, BART-Small, and GPT-2) on the Spider dataset, focusing on low-resource settings. We developed a reusable, model-agnostic pipeline that tailors schema formatting to each model's architecture. Each model was trained for 1000 to 5000 iterations and evaluated on 1000 test samples using Logical Form Accuracy (LFAcc), BLEU, and Exact Match (EM) metrics. Fine-tuned T5-Small achieves the highest LFAcc (27.8%), outperforming BART-Small (23.98%) and GPT-2 (20.1%), highlighting encoder-decoder models' superiority in schema-aware SQL generation. Although resource constraints limit performance, the pipeline's modularity supports future enhancements, such as advanced schema linking or alternative base models. This work underscores the potential of compact transformers for accessible text-to-SQL solutions in resource-scarce environments.
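As a rough illustration of how string-based accuracy metrics of this kind can be computed, the sketch below scores predictions with a strict exact match and with a looser match after light normalization. The abstract does not define how LFAcc or EM are computed, so the normalization rules and the helper names (`normalize_sql`, `exact_match`, `normalized_match`, `accuracy`) are assumptions, not the paper's evaluation code; BLEU is omitted.

```python
import re
from typing import Callable, List, Tuple


def normalize_sql(sql: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing semicolon.

    Assumed normalization for illustration. Lowercasing also changes
    string literals, which a real evaluator would need to protect.
    """
    sql = sql.strip().rstrip(";")
    sql = re.sub(r"\s+", " ", sql)
    return sql.lower().strip()


def exact_match(pred: str, gold: str) -> bool:
    """Strict comparison: identical strings apart from outer whitespace."""
    return pred.strip() == gold.strip()


def normalized_match(pred: str, gold: str) -> bool:
    """Looser comparison after normalize_sql is applied to both sides."""
    return normalize_sql(pred) == normalize_sql(gold)


def accuracy(
    pairs: List[Tuple[str, str]], match_fn: Callable[[str, str], bool]
) -> float:
    """Fraction of (prediction, gold) pairs accepted by match_fn."""
    if not pairs:
        return 0.0
    return sum(match_fn(pred, gold) for pred, gold in pairs) / len(pairs)


if __name__ == "__main__":
    pairs = [
        ("SELECT count(*) FROM singer", "select count(*)  from singer;"),
        ("SELECT name FROM singer", "SELECT name FROM singer"),
        ("SELECT age FROM singer", "SELECT name FROM singer"),
    ]
    print("strict:", accuracy(pairs, exact_match))
    print("normalized:", accuracy(pairs, normalized_match))
```

The gap between the two scores on the same predictions shows why the choice of matching rule matters when comparing reported numbers across papers.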

Problem

Research questions and friction points this paper is trying to address.

Evaluating lightweight transformers for Text-to-SQL generation

Developing a reusable pipeline for schema-aware SQL translation

Assessing model performance in low-resource settings

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight transformers for text-to-SQL

Model-agnostic pipeline for schema formatting

Fine-tuned T5-Small achieves the highest LFAcc (27.8%)
