TableMind: An Autonomous Programmatic Agent for Tool-Augmented Table Reasoning

📅 2025-09-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current large language models (LLMs) struggle with complex numerical computation and fine-grained operations in tabular reasoning, while tool-integrated systems lack multi-step adaptive decision-making capabilities. To address this, we propose TableMind, an autonomous programmatic agent that integrates hierarchical planning, self-reflection, and sandboxed code execution to enable iterative tool invocation and dynamic strategy adjustment. We introduce Rank-Aware Policy Optimization (RAPO), a reinforcement fine-tuning algorithm that increases the update weight of high-quality trajectories when the model assigns them lower output probabilities than low-quality ones, thereby steering training toward more accurate answers. Our training paradigm comprises two stages: supervised fine-tuning on high-quality reasoning traces, followed by multi-objective reinforcement fine-tuning. TableMind outperforms competitive baselines across multiple tabular reasoning benchmarks, with substantial gains in both answer accuracy and computational precision, which supports the case for autonomous, programmatic agents on complex tabular tasks.

📝 Abstract

Table reasoning is crucial for leveraging structured data in domains such as finance, healthcare, and scientific research. While large language models (LLMs) show promise in multi-step reasoning, purely text-based methods often struggle with the complex numerical computations and fine-grained operations that this task requires. Tool-integrated reasoning improves computational accuracy via explicit code execution, yet existing systems frequently rely on rigid patterns and supervised imitation, and they lack true autonomous adaptability. In this paper, we present TableMind, an LLM-driven table reasoning agent that (i) autonomously performs multi-turn tool invocation, (ii) writes and executes code in a secure sandbox environment for data analysis and precise numerical reasoning, and (iii) exhibits high-level capabilities such as planning and self-reflection to adapt its strategies. To realize these capabilities, we adopt a two-stage fine-tuning paradigm built on top of a powerful pre-trained language model: supervised fine-tuning on high-quality reasoning trajectories to establish effective tool usage patterns, followed by reinforcement fine-tuning to optimize multi-objective strategies. In particular, we propose Rank-Aware Policy Optimization (RAPO), which increases the update weight of high-quality trajectories when their output probabilities are lower than those of low-quality ones, thereby guiding the model more consistently toward better and more accurate answers. Extensive experiments on several mainstream benchmarks demonstrate that TableMind outperforms competitive baselines, yielding substantial gains in both reasoning accuracy and computational precision.
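
The abstract describes RAPO only at a high level, so the following is a minimal sketch of the stated idea rather than the paper's actual algorithm. It assumes a group of sampled trajectories, each with a scalar reward and a sequence log-probability under the current policy. The function name `rank_aware_weights`, the parameter `alpha`, and the pair-counting rule are all hypothetical choices made for illustration.

```python
from typing import List


def rank_aware_weights(
    rewards: List[float],
    logprobs: List[float],
    alpha: float = 0.5,  # hypothetical boost strength, not taken from the paper
) -> List[float]:
    """Return one update weight per trajectory (illustrative scheme only).

    A trajectory is boosted when it has a higher reward than another
    trajectory in the group but a lower log-probability, i.e. the policy
    currently ranks the pair in the wrong order.
    """
    n = len(rewards)
    if n != len(logprobs):
        raise ValueError("rewards and logprobs must have the same length")

    weights = []
    for i in range(n):
        # Count lower-reward trajectories that the policy prefers over i.
        misordered = sum(
            1
            for j in range(n)
            if rewards[i] > rewards[j] and logprobs[i] < logprobs[j]
        )
        # Normalise by the number of possible comparisons for trajectory i.
        frac = misordered / (n - 1) if n > 1 else 0.0
        weights.append(1.0 + alpha * frac)
    return weights
```

With rewards `[1.0, 0.0, 0.0, 1.0]` and log-probabilities `[-4.0, -2.0, -3.0, -1.0]`, only the first trajectory is boosted (weight about 1.33), because it is correct yet less likely than both incorrect ones. The last trajectory is correct and already the most likely, so its weight stays at 1.0. In a real training loop such weights would presumably scale each trajectory's contribution to the policy-gradient loss.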

Problem

Research questions and friction points this paper is trying to address.

Autonomous tool-augmented table reasoning for structured data

Overcoming limitations of text-based methods in numerical computations

Enhancing adaptability beyond rigid supervised imitation systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Autonomous multi-turn tool invocation

Secure sandbox code execution

Rank-aware reinforcement learning optimization
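
The first two items above can be pictured as a loop in which the model alternates between writing code and reading the result. The sketch below is an assumption-laden illustration, not TableMind's implementation: `solve`, `run_python`, `Policy`, and `scripted_policy` are hypothetical names, the policy is a hand-written stand-in for an LLM, and a subprocess with a timeout is used only as a placeholder for a real secure sandbox.

```python
import subprocess
import sys
import tempfile
from typing import Callable, List, Tuple

# A policy maps the dialogue history to ("code", source) or ("answer", text).
Policy = Callable[[List[str]], Tuple[str, str]]


def run_python(source: str, timeout_s: float = 5.0) -> str:
    """Run `source` in a separate interpreter and return its output.

    Placeholder only: a subprocess with a timeout is NOT a secure sandbox.
    """
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", source],
                capture_output=True,
                text=True,
                timeout=timeout_s,
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return "error: execution timed out"
    if proc.returncode != 0:
        return "error: " + proc.stderr.strip()
    return proc.stdout.strip()


def solve(question: str, table_csv: str, policy: Policy, max_turns: int = 4) -> str:
    """Alternate between policy decisions and code execution until an answer."""
    history = [f"question: {question}", f"table:\n{table_csv}"]
    for _ in range(max_turns):
        kind, payload = policy(history)
        if kind == "answer":
            return payload
        # Feed the output (or error) back so the policy can revise its plan.
        history.append(f"code:\n{payload}")
        history.append(f"observation: {run_python(payload)}")
    return "no answer within turn budget"


def scripted_policy(history: List[str]) -> Tuple[str, str]:
    """Hand-written stand-in for an LLM: fails once, then corrects itself."""
    last = history[-1]
    if last.startswith("observation: error"):
        return ("code", "print(sum([120, 80, 50]))")
    if last.startswith("observation: "):
        return ("answer", last[len("observation: "):])
    return ("code", "print(total)")  # deliberately fails: `total` is undefined
```

Calling `solve("What is the total revenue?", "region,revenue\nA,120\nB,80\nC,50", scripted_policy)` takes three turns: the first snippet raises a `NameError`, the error text is fed back as an observation, the second snippet succeeds, and the policy then returns `"250"` as its answer. The error-then-retry step is a toy version of the self-reflection behaviour the summary describes.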

🔎 Similar Papers

SheetAgent: Towards A Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models