GTR: Graph-Table-RAG for Cross-Table Question Answering

📅 2025-04-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Cross-table question answering requires combining knowledge spread across multiple tables, and realistic data for training and evaluating it is scarce. This paper proposes GTR (Graph-Table-RAG), a retrieval-augmented generation framework that reorganizes a table corpus into a heterogeneous graph, retrieves the most relevant tables through a hierarchical coarse-to-fine process, and uses graph-aware prompting to support the downstream LLM's tabular reasoning. Key contributions include: (1) MultiTableQA, a multi-table QA benchmark comprising 60K tables and 25K user queries collected from real-world sources; (2) GTR, which the authors describe as the first Graph-Table-RAG framework; and (3) experiments reporting superior cross-table question-answering performance while maintaining high deployment efficiency.

📝 Abstract

Beyond pure text, a substantial amount of knowledge is stored in tables. In real-world scenarios, user questions often require retrieving answers that are distributed across multiple tables. GraphRAG has recently attracted much attention for enhancing LLMs' reasoning capabilities by organizing external knowledge to address ad-hoc and complex questions, exemplifying a promising direction for cross-table question answering. In this paper, to address the current gap in available data, we first introduce a multi-table benchmark, MultiTableQA, comprising 60k tables and 25k user queries collected from real-world sources. Then, we propose the first Graph-Table-RAG framework, namely GTR, which reorganizes table corpora into a heterogeneous graph, employs a hierarchical coarse-to-fine retrieval process to extract the most relevant tables, and integrates graph-aware prompting for downstream LLMs' tabular reasoning. Extensive experiments show that GTR exhibits superior cross-table question-answering performance while maintaining high deployment efficiency, demonstrating its real-world practical applicability.

Problem

Research questions and friction points this paper is trying to address.

Develops a benchmark for cross-table question answering

Proposes a Graph-Table-RAG framework for multi-table queries

Enhances LLMs' reasoning with graph-aware prompting

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reorganizes tables into a heterogeneous graph

Hierarchical coarse-to-fine retrieval process

Graph-aware prompting for LLM reasoning
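
The three ideas listed here can be illustrated with a toy sketch. This is not the authors' implementation: the graph links tables that share a column name, the coarse stage scores table schemas with bag-of-words cosine similarity, and the fine stage re-ranks candidates on cell contents. All of these are simplifying assumptions standing in for the paper's heterogeneous graph and retrieval components. Every name below (`build_graph`, `retrieve`, `build_prompt`, the sample tables) is hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokens(text):
    """Lowercase and split on non-alphanumerics, so 'team_id' -> ['team', 'id']."""
    return re.findall(r"[a-z0-9]+", str(text).lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[tok] for tok, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def schema_vector(name, table):
    """Bag of words over the table name and column names only."""
    words = tokens(name)
    for col in table["columns"]:
        words += tokens(col)
    return Counter(words)


def content_vector(name, table):
    """Schema words plus every cell value."""
    vec = schema_vector(name, table)
    for row in table["rows"]:
        for cell in row:
            vec.update(tokens(cell))
    return vec


def build_graph(tables):
    """Assumed linking rule: connect two tables if they share a column name."""
    edges = {name: set() for name in tables}
    for a, b in combinations(tables, 2):
        if set(tables[a]["columns"]) & set(tables[b]["columns"]):
            edges[a].add(b)
            edges[b].add(a)
    return edges


def retrieve(query, tables, edges, seeds=1, top_k=2):
    """Coarse: pick seed tables by schema match and add their graph neighbors.
    Fine: re-rank those candidates using full cell contents."""
    q = Counter(tokens(query))
    coarse = {n: cosine(q, schema_vector(n, t)) for n, t in tables.items()}
    ranked = sorted(tables, key=lambda n: (-coarse[n], n))
    candidates = set(ranked[:seeds])
    for seed in ranked[:seeds]:
        candidates |= edges[seed]
    fine = {n: cosine(q, content_vector(n, tables[n])) for n in candidates}
    return sorted(candidates, key=lambda n: (-fine[n], n))[:top_k]


def build_prompt(query, selected, tables, edges):
    """Serialize the selected tables and state which of them are linked."""
    lines = ["Answer the question using the tables below."]
    for name in selected:
        table = tables[name]
        linked = sorted(edges[name] & set(selected))
        lines.append(f"Table {name} (linked to: {', '.join(linked) or 'none'})")
        lines.append(" | ".join(table["columns"]))
        lines += [" | ".join(str(c) for c in row) for row in table["rows"]]
    lines.append(f"Question: {query}")
    return "\n".join(lines)


# Made-up sample corpus for illustration only.
tables = {
    "players": {"columns": ["player_id", "name", "team_id"],
                "rows": [[1, "Ana", 10], [2, "Ben", 20]]},
    "teams": {"columns": ["team_id", "team_name", "city"],
              "rows": [[10, "Hawks", "Atlanta"], [20, "Bulls", "Chicago"]]},
    "weather": {"columns": ["date", "temperature"],
                "rows": [["2024-01-01", 3]]},
}
query = "Which city does the team of player Ana play in?"
edges = build_graph(tables)
selected = retrieve(query, tables, edges)
prompt = build_prompt(query, selected, tables, edges)
```

In this sketch the graph matters at the coarse stage: a table that matches the query weakly on its own can still become a candidate because it is linked to a strongly matching seed, which is the situation cross-table questions create.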

🔎 Similar Papers

TableBench: A Comprehensive and Complex Benchmark for Table Question Answering