🤖 AI Summary
This work addresses two limitations of existing LLM-based pipelines that incorporate news into time series forecasting: relevant articles often exceed the model's context window, and unguided iterative retrieval of supplementary news leads to redundant updates and slow convergence. The authors propose a framework that combines importance-aware news compression with multi-round retrieval guided by a process reward model (PRM). An importance reward model estimates each article's forecasting utility and allocates compression budgets during sequential pairwise fusion, while the PRM ranks candidate supplementary articles given the current error profile and the articles already selected. Both models are trained offline, and inference runs the frozen modules without a reflection loop. On finance, energy, traffic, and Bitcoin forecasting benchmarks, the method improves accuracy over strong baselines, requires significantly fewer refinement iterations than the iterative baseline, and remains effective when the relevant articles span thousands of tokens.
📝 Abstract
Incorporating news into time series forecasting is appealing because news can reveal abrupt exogenous events that historical values alone cannot recover. However, existing LLM-based news-forecasting pipelines face two practical limitations: relevant news articles often exceed the model's context window, and iterative retrieval of supplementary news is typically unguided, leading to redundant updates and slow convergence. We address these issues with a novel framework that combines importance-aware news compression and process-level retrieval supervision. First, we train an importance reward model that estimates the forecasting utility of each article and uses this signal to allocate compression budgets during sequential pairwise fusion, preserving informative content within a fixed context limit. Second, we introduce a process reward model (PRM) that ranks multiple supplementary-news candidates conditioned on the current error profile and the history of previously selected articles, replacing one-shot blind retrieval with quality-controlled selection. Both components are trained offline using historical data with ground truth; inference uses the frozen filtering logic and compression modules without any reflection loop. Experiments on finance, energy, traffic, and Bitcoin forecasting benchmarks show that our method improves prediction accuracy over strong baselines, significantly reduces the number of refinement iterations compared to the iterative baseline, and remains effective when relevant articles span thousands of tokens.
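The abstract does not spell out how importance scores are turned into compression budgets. As an illustration only, the sketch below assumes the simplest rule: each article receives a share of a fixed token limit proportional to its importance score, with an optional per-article floor. The function name `allocate_budgets`, the `min_tokens` parameter, and the proportional rule itself are assumptions made for this example, not the authors' method.

```python
from typing import List, Sequence


def allocate_budgets(
    scores: Sequence[float], total_budget: int, min_tokens: int = 0
) -> List[int]:
    """Split `total_budget` tokens across articles in proportion to `scores`.

    Hypothetical helper: proportional allocation is an assumed rule,
    not one taken from the paper.
    """
    n = len(scores)
    if n == 0:
        return []
    if any(s < 0 for s in scores):
        raise ValueError("importance scores must be non-negative")
    if min_tokens * n > total_budget:
        raise ValueError("total_budget cannot cover min_tokens per article")

    # Reserve the floor for every article, then share out the rest.
    remaining = total_budget - min_tokens * n
    total_score = sum(scores)
    if total_score == 0:
        weights = [1.0 / n] * n  # no signal: fall back to an even split
    else:
        weights = [s / total_score for s in scores]

    budgets = [min_tokens + int(remaining * w) for w in weights]

    # Rounding down can leave a few tokens unassigned; give them to the
    # highest-scoring articles, one token each.
    leftover = max(total_budget - sum(budgets), 0)
    by_importance = sorted(range(n), key=lambda i: scores[i], reverse=True)
    for i in by_importance[:leftover]:
        budgets[i] += 1
    return budgets


if __name__ == "__main__":
    # Three articles with made-up importance scores and a 1000-token limit.
    print(allocate_budgets([2.0, 1.0, 1.0], total_budget=1000, min_tokens=100))
```

In a pipeline of the kind the abstract describes, each budget would presumably cap the compressed length of the corresponding article before it is fused with the running summary, so that the fused text stays within the fixed context limit.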