AI Summary
This study addresses the limitation of current large language models in supporting future-oriented numerical forecasting within open-domain table question answering, a gap exacerbated by the absence of dedicated datasets and methodologies. To bridge this gap, the work introduces the first open-domain table QA task specifically designed for future prediction and constructs the first real-world time-series table QA benchmark dataset, grounded in the real estate domain. Furthermore, it proposes TimeFore, a multi-agent framework that integrates SQL-based retrieval, external time-series forecasting models, and large language model reasoning through three cooperating modules: retrieval, prediction, and analysis. Experimental results demonstrate that TimeFore significantly enhances both prediction accuracy and answer consistency, offering an effective solution for future-oriented table question answering.
Abstract
The rapid development of LLMs has significantly advanced tabular question answering, but most systems cannot perform future-oriented numerical prediction. To address this gap, we introduce a novel task, Open-Domain Tabular Question Answering for Future Data Forecasting and Reasoning, and present the first dataset to cover time-series forecasting and forecast-based reasoning scenarios using real estate data. This task poses three challenges: retrieving precise historical data, overcoming the forecasting limitations of LLMs, and standardizing responses for diverse queries. To address these challenges, we propose TimeFore, an LLM agent-based framework that decomposes the problem into three collaborative roles: a Retriever autonomously generates SQL to fetch data, a Forecaster invokes external time-series models for higher accuracy, and an Analyzer synthesizes the results to construct a precise and consistent final answer. Extensive experiments demonstrate the effectiveness of TimeFore.
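The three-role decomposition described above can be illustrated with a minimal sketch. This is not the authors' implementation: the LLM-written SQL is replaced by a fixed parameterized query, the external time-series model is replaced by a plain least-squares linear trend, and the Analyzer is a string template. The table name `prices`, its columns, the region `Springfield`, and all numbers are hypothetical toy values chosen only to make the flow concrete.

```python
import sqlite3
from statistics import mean


def build_demo_db():
    """Create an in-memory table with synthetic (made-up) yearly prices."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (region TEXT, year INTEGER, price REAL)")
    rows = [("Springfield", 2020 + i, 100.0 + 10.0 * i) for i in range(4)]
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
    return conn


def retrieve(conn, region):
    """Retriever stand-in: a fixed query instead of LLM-generated SQL."""
    sql = "SELECT year, price FROM prices WHERE region = ? ORDER BY year"
    return conn.execute(sql, (region,)).fetchall()


def forecast(history, target_year):
    """Forecaster stand-in: least-squares linear trend (assumed model,
    standing in for whatever external time-series model is invoked)."""
    xs = [year for year, _ in history]
    ys = [price for _, price in history]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar + slope * (target_year - x_bar)


def analyze(region, target_year, history, predicted):
    """Analyzer stand-in: turn the numbers into one standardized answer."""
    last_year, last_price = history[-1]
    change_pct = (predicted - last_price) / last_price * 100
    return (
        f"Forecast for {region} in {target_year}: {predicted:.1f} "
        f"({change_pct:+.1f}% vs. {last_year})"
    )


def answer(conn, region, target_year):
    """Run the three roles in sequence: retrieve, forecast, analyze."""
    history = retrieve(conn, region)
    if len(history) < 2:
        raise ValueError("need at least two historical points to fit a trend")
    predicted = forecast(history, target_year)
    return analyze(region, target_year, history, predicted)


if __name__ == "__main__":
    print(answer(build_demo_db(), "Springfield", 2025))
```

Keeping the numeric prediction in a dedicated function, rather than asking a language model to extrapolate, mirrors the abstract's motivation for delegating forecasting to an external model; the other two roles only move data in and format results out.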