TLRD: Teaching LLMs to Reason over Tabular Data with Tri-Level Rationale Distillation

📅 2026-06-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) have limited ability to understand and reason over tabular data, and label-based fine-tuning often causes catastrophic forgetting while offering no interpretability. To address these issues, this work proposes a tri-level rationale distillation framework that introduces, for the first time, a multi-granularity rationale generation mechanism combining instance-specific features, dataset-level distributional patterns, and retrieved neighboring examples. Through a teacher–student architecture, these structured rationales are distilled into a lightweight LLM without additional inference overhead. The approach substantially narrows the performance gap between LLMs and state-of-the-art tree-based ensemble models across diverse tabular tasks, and it produces grounded, human-readable explanations for model decisions, combining strong predictive performance with interpretability.

📝 Abstract

Tabular data is a primary medium for storing real-world information and drives many industrial applications of machine learning. Traditional predictors achieve strong predictive performance but do not provide the readable, case-specific explanations that decision-making requires. Large Language Models (LLMs) can naturally bridge this gap by generating predictions alongside explanations. However, dataset-specific patterns, such as feature distributions and interactions, make tabular data difficult for LLMs to understand and reason over, and label-only fine-tuning improves performance only at the cost of catastrophic forgetting. To address this problem, we propose Tri-Level Rationale Distillation (TLRD), a framework that converts label-only tabular datasets into structured rationale supervision for LLMs. TLRD uses a high-capacity teacher to synthesize a rationale corpus grounded in three complementary levels of evidence: instance-level features, dataset-level distributional context, and comparison-level retrieved neighbors. It then distills these rationales into student LLMs, enabling zero-overhead prediction and grounded explanation from raw features only. Experiments on datasets from multiple domains show that TLRD significantly narrows the performance gap between LLMs and state-of-the-art tree ensembles while producing grounded and readable explanations, offering a valuable reference for high-stakes decision-making.
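To make the three levels of evidence concrete, the following is a minimal sketch of how a teacher prompt might be assembled for one row. It is an illustration under stated assumptions, not the authors' implementation: the function names (`percentile_rank`, `build_tri_level_evidence`), the use of percentile ranks and means as dataset-level context, and range-scaled Euclidean distance for neighbor retrieval are all hypothetical choices that the abstract does not specify.

```python
import math
from statistics import mean


def percentile_rank(values, x):
    """Percentage of training values less than or equal to x."""
    return 100.0 * sum(v <= x for v in values) / len(values)


def build_tri_level_evidence(train_rows, train_labels, query, k=2):
    """Assemble a teacher prompt from three levels of evidence (hypothetical format)."""
    features = list(query)
    columns = {f: [row[f] for row in train_rows] for f in features}

    # Level 1: instance-level features of the row being explained.
    instance = ", ".join(f"{f}={query[f]}" for f in features)

    # Level 2: dataset-level distributional context (assumed: percentile rank and mean).
    dataset = "; ".join(
        f"{f} percentile rank {percentile_rank(columns[f], query[f]):.0f} "
        f"(training mean {mean(columns[f]):.1f})"
        for f in features
    )

    # Level 3: comparison-level neighbors (assumed: range-scaled Euclidean distance).
    span = {f: (max(col) - min(col)) or 1.0 for f, col in columns.items()}

    def distance(row):
        return math.sqrt(sum(((row[f] - query[f]) / span[f]) ** 2 for f in features))

    nearest = sorted(range(len(train_rows)), key=lambda i: distance(train_rows[i]))[:k]
    neighbors = "\n".join(
        "  " + ", ".join(f"{f}={train_rows[i][f]}" for f in features)
        + f", label={train_labels[i]}"
        for i in nearest
    )

    return (
        f"Instance-level features: {instance}\n"
        f"Dataset-level context: {dataset}\n"
        f"Retrieved neighbors:\n{neighbors}\n"
        "Explain the prediction using all three levels of evidence."
    )


# Toy data, invented purely for illustration.
rows = [
    {"age": 25, "income": 30},
    {"age": 40, "income": 60},
    {"age": 55, "income": 90},
    {"age": 60, "income": 100},
]
labels = ["no", "no", "yes", "yes"]
print(build_tri_level_evidence(rows, labels, {"age": 58, "income": 95}))
```

In a setup like the one the abstract describes, prompts of this kind would be used only to build the rationale corpus with the teacher; the student would then be trained to produce the label and rationale from the raw features alone, which is why no retrieval or statistics are needed at inference time.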

Problem

Research questions and friction points this paper is trying to address.

tabular data

Large Language Models

reasoning

explanations

catastrophic forgetting

Innovation

Methods, ideas, or system contributions that make the work stand out.

Tri-Level Rationale Distillation

Tabular Reasoning

Large Language Models