🤖 AI Summary
This work systematically investigates the joint sensitivity of BiLSTM-based time series forecasting models to input sequence length and additive noise, elucidating their impact on model robustness and generalization. Through controlled experiments on multi-frequency real-world datasets, we establish a standardized, modular, and reproducible evaluation pipeline. Our key findings are: (1) excessive sequence length induces overfitting and data leakage; (2) additive noise consistently degrades prediction accuracy; and (3) the combined effect causes the most severe performance degradation; even high-frequency data, though comparatively more robust, remains vulnerable. The main contributions are: (1) empirical identification of the synergistic degradation effect between sequence length and noise; (2) proposal of a “data-aware modeling” paradigm, advocating pre-modeling analysis of data characteristics — especially under low-sample and high-noise regimes — to guide architecture design; and (3) open-sourcing an extensible evaluation framework to facilitate future research on robust time series modeling.
📝 Abstract
Deep learning (DL) models, a specialized class of multilayer neural networks, have become central to time-series forecasting in critical domains such as environmental monitoring and the Internet of Things (IoT). Among these, Bidirectional Long Short-Term Memory (BiLSTM) architectures are particularly effective in capturing complex temporal dependencies. However, the robustness and generalization of such models are highly sensitive to input data characteristics, an aspect that remains underexplored in the existing literature. This study presents a systematic empirical analysis of two key data-centric factors: input sequence length and additive noise. To support this investigation, a modular and reproducible forecasting pipeline is developed, incorporating standardized preprocessing, sequence generation, model training, validation, and evaluation. Controlled experiments are conducted on three real-world datasets with varying sampling frequencies to assess BiLSTM performance under different input conditions. The results yield three key findings: (1) longer input sequences significantly increase the risk of overfitting and data leakage, particularly in data-constrained environments; (2) additive noise consistently degrades predictive accuracy across sampling frequencies; and (3) the simultaneous presence of both factors results in the most substantial decline in model stability. While datasets with higher observation frequencies exhibit greater robustness, they remain vulnerable when both input challenges are present. These findings highlight important limitations in current DL-based forecasting pipelines and underscore the need for data-aware design strategies. This work contributes to a deeper understanding of DL model behavior in dynamic time-series environments and provides practical insights for developing more reliable and generalizable forecasting systems.
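The two experimental factors described above, input sequence length and additive noise, can be illustrated with a minimal sketch of the corresponding pipeline stages: sliding-window sequence generation and noise injection. This is an illustrative reconstruction, not the authors' released code; in particular, the SNR-based noise parameterization and all function and variable names (`make_windows`, `add_noise`, `seq_len`, `snr_db`) are assumptions for demonstration.

```python
import numpy as np

def make_windows(series, seq_len, horizon=1):
    """Slide a fixed-length window over the series to build (X, y) pairs.
    seq_len is the input sequence length varied in the study."""
    X, y = [], []
    for i in range(len(series) - seq_len - horizon + 1):
        X.append(series[i:i + seq_len])
        y.append(series[i + seq_len + horizon - 1])
    return np.array(X), np.array(y)

def add_noise(series, snr_db, rng=None):
    """Inject zero-mean additive Gaussian noise at a target SNR (dB).
    Hypothetical parameterization: the abstract specifies only
    'additive noise', not its distribution or scaling."""
    rng = rng or np.random.default_rng(0)
    signal_power = np.mean(series ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return series + rng.normal(0.0, np.sqrt(noise_power), size=series.shape)

# Toy example: a synthetic daily-seasonality series, two window lengths.
t = np.arange(200, dtype=float)
clean = np.sin(2 * np.pi * t / 24)   # stands in for a real sensor signal
noisy = add_noise(clean, snr_db=10)

X_short, y_short = make_windows(noisy, seq_len=12)
X_long, y_long = make_windows(noisy, seq_len=96)
print(X_short.shape, X_long.shape)   # longer windows yield fewer samples
```

Note how the longer window sharply reduces the number of training samples extracted from the same series, which is one mechanism behind the overfitting risk the study reports in data-constrained settings.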