🤖 AI Summary
This work addresses key limitations of existing large language model (LLM) agents for tabular data analysis: limited transparency into intermediate decisions, weak support for multi-table comparison, and poor adaptation to user preferences. The authors propose TabClaw, an open-source, interactive AI agent that uses natural-language requests to drive spreadsheet manipulation and table reasoning. It exposes an editable execution plan, marks consensus and uncertainty explicitly in its findings, extracts persistent user memory, and distills reusable skills from repeated tool-use patterns, upgrading them from negative feedback so that the agent personalizes itself over time. By combining a ReAct-style tool-using loop, specialist agents for parallel multi-table reasoning, and package-style skill import, the system improves executable task completion and reasoning performance on spreadsheet manipulation and table reasoning benchmarks while keeping the user workflow inspectable.
📝 Abstract
Spreadsheets and tables are widely used representations for structured data analysis, but effective analysis still requires substantial manual effort and domain expertise. Recent large language model (LLM) agents can automate parts of this process, but they often provide limited transparency into intermediate decisions, rely on implicit assumptions, struggle with multi-table comparison, and repeat similar workflows without adapting to a user's preferences. This paper presents TabClaw, an open-source interactive AI agent for spreadsheet manipulation and table reasoning. Users upload CSV or Excel files and issue natural-language requests; TabClaw clarifies ambiguous intent, exposes an editable execution plan, streams a ReAct-style tool-using analysis loop, dispatches specialist agents for parallel multi-table reasoning, and synthesizes findings with explicit consensus and uncertainty markers. Beyond one-off analysis, TabClaw records completed workflows, extracts persistent user memory, distills reusable skills from repeated tool-use patterns, supports package-style skill import, and upgrades skills from negative feedback. Experiments on spreadsheet manipulation and table reasoning benchmarks show that TabClaw improves executable task completion and reasoning performance while preserving an inspectable user workflow. This paper shows how TabClaw turns spreadsheets and tables into inspectable analytical workflows while gradually personalizing itself to recurring data-analysis tasks. Our code is available.
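To make the idea of distilling reusable skills from repeated tool-use patterns more concrete, the following is a minimal sketch and not TabClaw's actual implementation. The abstract does not specify the mechanism, so the sketch assumes that each recorded workflow is an ordered list of tool names and that a candidate skill is any contiguous tool sequence appearing in at least `min_support` workflows. The function name `candidate_skills`, its parameters, and the tool names in the example are all hypothetical.

```python
from collections import Counter


def candidate_skills(workflows, min_len=2, max_len=4, min_support=2):
    """Return contiguous tool-call sequences that recur across workflows.

    Hypothetical illustration: `workflows` is a list of recorded runs,
    each an ordered list of tool names. A pattern is counted at most
    once per workflow, so `support` is the number of workflows using it.
    """
    support = Counter()
    for calls in workflows:
        seen = set()
        for n in range(min_len, max_len + 1):
            for i in range(len(calls) - n + 1):
                seen.add(tuple(calls[i:i + n]))
        support.update(seen)

    frequent = [(p, c) for p, c in support.items() if c >= min_support]
    # Prefer longer patterns, then higher support, then a stable name order.
    frequent.sort(key=lambda pc: (-len(pc[0]), -pc[1], pc[0]))
    return frequent


if __name__ == "__main__":
    # Tool names below are invented for the example.
    recorded = [
        ["load_table", "filter_rows", "group_by", "plot"],
        ["load_table", "filter_rows", "group_by", "export"],
        ["load_table", "pivot", "plot"],
    ]
    for pattern, count in candidate_skills(recorded):
        print(count, " -> ".join(pattern))
```

In this example, the sequence `load_table -> filter_rows -> group_by` is shared by two of the three workflows and is ranked first as the longest recurring pattern. A real system would also need to generalize tool arguments and decide when a negative-feedback signal should revise a stored skill, which this sketch does not attempt.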