TSQAgent: Rating Time Series Data Quality via Dedicated Agentic Reasoning

📅 2026-06-02
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing large language models struggle to automatically identify key quality dimensions in time series data and to perform evidence-based quantitative comparisons. To address this limitation, this work proposes TSQAgent, a multi-agent collaborative framework that integrates three specialized roles (Perceiver, Inspector, and Adjudicator) to enable dimension-aware, tool-augmented quantitative analysis and holistic evaluation. Additionally, the study introduces TSQBench, the first benchmark specifically designed for time series quality assessment. Experimental results demonstrate that TSQAgent significantly enhances LLMs' capacity for understanding and comparing time series quality on both TSQBench and eleven real-world datasets, leading to more effective data selection and improved performance in downstream tasks.
📝 Abstract
Assessing the quality of time series (TS) data is fundamental yet inherently challenging due to the multifaceted nature of quality dimensions. Recently, large language models (LLMs) have emerged as a promising paradigm for TS quality assessment via pairwise comparison and per-dimension evaluation. However, existing approaches rely on manually predefined quality dimensions and purely text-based reasoning, leaving it unknown whether LLMs can identify truly relevant quality dimensions or perform grounded and quantitative quality comparisons. To investigate this, we construct TSQBench, a dedicated benchmark for evaluating LLMs on two progressive capabilities: (i) understanding and identifying relevant quality dimensions, and (ii) performing quality comparison under specific dimensions. Our analysis reveals that current LLMs consistently struggle with both dimension identification and evidence-grounded quality comparison. To address these limitations, we propose TSQAgent, a novel agentic reasoning framework for TS quality rating consisting of three collaborative roles: Perceiver for focused dimension selection, Inspector for dimension-wise quantitative analysis, and Adjudicator for aggregating and refining the final judgment. In particular, we introduce an agentic reasoning strategy that instills the ability to identify and prioritize the most relevant quality dimensions, and further propose an agent workflow equipped with external analytical tools to enable precise quantitative comparisons over selected dimensions. Experiments on both the proposed benchmark and eleven real-world datasets demonstrate that our framework not only substantially improves LLMs' capabilities in quality understanding and quantitative comparison but also effectively translates these improvements into better quality-aware data selection, leading to enhanced downstream performance and data efficiency.
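The three-role workflow described in the abstract can be illustrated with a small, purely hypothetical sketch. This is not the authors' implementation: in the paper the roles are LLM-driven, whereas here each role is replaced by a simple heuristic so that the control flow is runnable. The function names (`perceive`, `inspect_dims`, `adjudicate`, `rate_pair`), the three quality dimensions, the metric definitions, and the majority-vote aggregation are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev

def _clean(ts):
    """Drop missing points, which this sketch encodes as NaN."""
    return [x for x in ts if not math.isnan(x)]

# --- Hypothetical analytical tools (lower value = better quality) ---
def missing_rate(ts):
    """Fraction of points that are missing."""
    return sum(math.isnan(x) for x in ts) / len(ts) if ts else 0.0

def noise_level(ts):
    """Population std of first differences, a crude roughness proxy."""
    vals = _clean(ts)
    diffs = [b - a for a, b in zip(vals, vals[1:])]
    return pstdev(diffs) if diffs else 0.0

def outlier_rate(ts):
    """Fraction of points more than 3 standard deviations from the mean."""
    vals = _clean(ts)
    if len(vals) < 2:
        return 0.0
    mu, sigma = mean(vals), pstdev(vals)
    if sigma == 0:
        return 0.0
    return sum(abs(x - mu) > 3 * sigma for x in vals) / len(vals)

# Assumed dimension names; the paper's actual dimension set is not given here.
TOOLS = {
    "missing_values": missing_rate,
    "noise": noise_level,
    "outliers": outlier_rate,
}

def perceive(ts_a, ts_b, top_k=2):
    """Perceiver stand-in: keep the dimensions where the two series differ most."""
    gaps = {}
    for dim, tool in TOOLS.items():
        a, b = tool(ts_a), tool(ts_b)
        gaps[dim] = abs(a - b) / (abs(a) + abs(b) + 1e-12)  # relative gap
    ranked = sorted(gaps, key=gaps.get, reverse=True)
    return [d for d in ranked[:top_k] if gaps[d] > 0]

def inspect_dims(ts_a, ts_b, dims):
    """Inspector stand-in: collect quantitative evidence per selected dimension."""
    return {d: (TOOLS[d](ts_a), TOOLS[d](ts_b)) for d in dims}

def adjudicate(evidence):
    """Adjudicator stand-in: majority vote over per-dimension comparisons."""
    votes = 0
    for a, b in evidence.values():
        if a < b:
            votes += 1
        elif b < a:
            votes -= 1
    return "A" if votes > 0 else "B" if votes < 0 else "tie"

def rate_pair(ts_a, ts_b, top_k=2):
    """Run the three stages in order and return the preferred series."""
    dims = perceive(ts_a, ts_b, top_k)
    return adjudicate(inspect_dims(ts_a, ts_b, dims))

# Toy data: ts_b has two missing points and one spike.
nan = float("nan")
ts_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ts_b = [1.0, nan, 3.0, 40.0, 5.0, nan, 7.0, 8.0]
verdict = rate_pair(ts_a, ts_b)
```

On this toy pair, the sketch selects the missing-value and noise dimensions and prefers `ts_a`. Separating dimension selection from measurement mirrors the abstract's point that comparisons should rest on tool-computed evidence for the relevant dimensions only, rather than on text-only reasoning over a fixed list.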
Problem

Research questions and friction points this paper is trying to address.

time series data quality
quality dimension identification
quantitative quality comparison
large language models
evidence-grounded reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

agentic reasoning
time series quality assessment
large language models
quantitative comparison
quality-aware data selection