Rethinking Table Instruction Tuning

📅 2025-01-24
📈 Citations: 1
Influential: 1
📄 PDF
🤖 AI Summary
Problem: Existing table-tuned large language models (LLMs) show significant declines in out-of-domain table understanding and in general-purpose capability; conventional fine-tuning improves in-domain tabular performance at the expense of generality.
Method: The paper systematically identifies the critical role of hyperparameters, especially the learning rate, in balancing specialized and general capabilities, and proposes TAMA: a lightweight paradigm that instruction-tunes LLaMA 3.1 8B Instruct with a small learning rate and comparatively few training instances to jointly enhance table understanding and general-purpose reasoning.
Contribution/Results: TAMA matches or surpasses GPT-3.5 and GPT-4 across diverse tabular tasks while preserving strong performance on general benchmarks (e.g., MMLU, BBH) and on out-of-domain tables. It does so with substantially reduced annotation cost and training overhead, challenging the prevailing assumption that tabular fine-tuning inevitably compromises general capability.

📝 Abstract
Recent advances in table understanding have focused on instruction-tuning large language models (LLMs) for table-related tasks. However, existing research has overlooked the impact of hyperparameter choices and lacks a comprehensive evaluation of the out-of-domain table understanding ability and the general capabilities of these table LLMs. In this paper, we evaluate these abilities in existing table LLMs, and reveal significant declines in both out-of-domain table understanding and general capabilities compared to their base models. Through systematic analysis, we show that hyperparameters, such as learning rate, can significantly influence both table-specific and general capabilities. Contrary to the existing table instruction-tuning works, we demonstrate that smaller learning rates and fewer training instances can enhance table understanding while preserving general capabilities. Based on our findings, we introduce TAMA, a TAble LLM instruction-tuned from LLaMA 3.1 8B Instruct, which achieves performance on par with, or surpassing GPT-3.5 and GPT-4 on table tasks, while maintaining strong out-of-domain generalization and general capabilities. Our findings highlight the potential for reduced data annotation costs and more efficient model development through careful hyperparameter selection.
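The abstract's core claim is that a smaller learning rate and fewer training instances suffice for table instruction tuning while preserving general capability. A minimal sketch of the two hyperparameter regimes is below; all numeric values are illustrative placeholders, not the paper's reported settings, and `update_magnitude` is a hypothetical back-of-the-envelope proxy, not a quantity the paper defines.

```python
from dataclasses import dataclass

@dataclass
class SFTConfig:
    """Minimal supervised fine-tuning hyperparameter bundle."""
    base_model: str
    learning_rate: float
    num_train_examples: int
    num_epochs: int

# A conventional table-instruction-tuning recipe: relatively high learning
# rate, large table corpus. (Values are assumptions for illustration.)
conventional = SFTConfig(
    base_model="meta-llama/Llama-3.1-8B-Instruct",
    learning_rate=2e-5,
    num_train_examples=100_000,
    num_epochs=2,
)

# The regime the abstract argues for: a smaller learning rate and fewer
# training instances. (Again, the exact numbers are placeholders.)
tama_style = SFTConfig(
    base_model="meta-llama/Llama-3.1-8B-Instruct",
    learning_rate=1e-6,
    num_train_examples=2_500,
    num_epochs=2,
)

def update_magnitude(cfg: SFTConfig) -> float:
    """Crude proxy for how far tuning can drift the base weights:
    learning rate x number of gradient steps (batch size 1 for simplicity)."""
    return cfg.learning_rate * cfg.num_train_examples * cfg.num_epochs

# The small-LR, small-data regime perturbs the base model far less, which is
# the intuition behind retaining general (MMLU/BBH-style) capability.
ratio = update_magnitude(conventional) / update_magnitude(tama_style)
print(f"conventional drifts ~{ratio:.0f}x further from the base model")
```

Under these placeholder numbers the conventional recipe applies roughly 800 times the cumulative update magnitude; the point is only the direction of the comparison, not the specific ratio.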
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Tabular Data
Parameter Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning Rate Adjustment
Reduced Training Data
TAMA Model