🤖 AI Summary
This work addresses zero-shot tabular question answering (TQA) with a fine-tuning-free, large language model (LLM)-driven code generation approach. Given an input question, the method automatically synthesizes executable Python code that retrieves the answer from a structured table. To enhance robustness, it employs a modular three-stage pipeline: (1) column importance identification and data type analysis to improve semantic understanding of the table; (2) initial code generation guided by structured prompting; and (3) error-feedback-driven iterative code regeneration. Because no parameters are fine-tuned, the LLM's generalization is preserved while code correctness and cross-domain adaptability improve in zero-shot settings. Evaluated on SemEval 2025 Task 8, the method ranks 33rd among 53 participating teams, providing empirical evidence that a purely zero-shot code-generation paradigm is feasible for complex TQA without task-specific adaptation.
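To make the first two stages concrete, here is a minimal sketch under stated assumptions: the `llm_generate` helper is hypothetical (any chat-completion client could back it), the prompt wording is illustrative rather than the authors' exact prompts, and the table is assumed to arrive as a pandas DataFrame.

```python
import pandas as pd

def llm_generate(prompt: str) -> str:
    """Hypothetical LLM call; replace with any chat-completion client."""
    raise NotImplementedError

def identify_columns(df: pd.DataFrame, question: str) -> str:
    """Stage 1: surface the column names and data types to the model
    so it can pick the columns relevant to the question."""
    schema = "\n".join(f"- {col} ({df[col].dtype})" for col in df.columns)
    prompt = (
        f"Table columns and types:\n{schema}\n\n"
        f"Question: {question}\n"
        "List only the columns needed to answer the question."
    )
    return llm_generate(prompt)

def generate_code(question: str, column_hint: str) -> str:
    """Stage 2: structured prompt asking for executable pandas code."""
    prompt = (
        f"Relevant columns:\n{column_hint}\n\n"
        f"Question: {question}\n"
        "Write Python code that answers the question using the pandas "
        "DataFrame `df` and stores the result in a variable named `answer`."
    )
    return llm_generate(prompt)
```

Constraining the generated code to a fixed contract (operate on `df`, write to `answer`) keeps the execution and verification stages simple, which is one plausible reading of the structured prompting described above.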
📝 Abstract
This paper describes our participation in SemEval 2025 Task 8, focused on Tabular Question Answering. We developed a zero-shot pipeline that leverages a Large Language Model to generate functional code that extracts the relevant information from tabular data based on an input question. Our approach is a modular pipeline in which the main code generator module is supported by additional components that identify the most relevant columns and analyze their data types to improve extraction accuracy. If the generated code fails, an iterative refinement process is triggered that incorporates the error feedback into a new generation prompt to enhance robustness. Our results show that zero-shot code generation is a valid approach for Tabular QA, achieving rank 33 of 53 in the test phase despite the lack of task-specific fine-tuning.
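The iterative refinement step can be sketched as a simple execute-and-retry loop. This reuses the hypothetical `llm_generate` helper from the sketch above; the retry budget and repair-prompt wording are assumptions, not the authors' exact setup.

```python
import traceback
import pandas as pd

def run_with_refinement(df: pd.DataFrame, question: str, code: str,
                        max_attempts: int = 3):
    """Execute generated code; on failure, feed the traceback back
    into a new generation prompt and retry (error-feedback loop)."""
    for _ in range(max_attempts):
        scope = {"df": df}
        try:
            exec(code, scope)  # generated code is expected to define `answer`
            return scope["answer"]
        except Exception:
            error = traceback.format_exc()
            prompt = (
                f"This code failed:\n{code}\n\n"
                f"Error:\n{error}\n\n"
                f"Question: {question}\n"
                "Fix the code so it runs on the DataFrame `df` and stores "
                "the result in a variable named `answer`."
            )
            code = llm_generate(prompt)  # hypothetical helper from above
    return None  # all attempts failed
```

Feeding the raw traceback back to the model gives it concrete evidence of what went wrong (a missing column, a type mismatch), which is what allows the loop to recover from the most common failure modes without any fine-tuning.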