🤖 AI Summary
This study addresses the challenge of degraded data quality—specifically outliers and missing values—in long-horizon time series forecasting, which critically undermines model robustness. We establish a unified evaluation framework to systematically benchmark mainstream models—including LSTM, Prophet, XGBoost, and Random Forest—under three realistic data conditions: complete, noisy (outlier-contaminated), and incomplete (missing-value) sequences, with ARIMA as the baseline. Methodologically, we employ sliding-window modeling, multi-step rolling prediction, and adaptive imputation for preprocessing. Our key contributions include: (i) a novel, interpretable algorithm selection guideline grounded in data characteristics and forecasting requirements; and (ii) empirical findings demonstrating that XGBoost reduces average MAE by 23% under noise, while Prophet exhibits superior stability for long-term trend forecasting. The results provide reproducible, principled guidance for industrial-scale time series modeling.
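The sliding-window modeling and multi-step rolling prediction mentioned above can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: the window size, horizon, and the trivial window-mean predictor (standing in for LSTM, XGBoost, etc.) are all illustrative assumptions.

```python
def make_windows(series, window):
    """Sliding-window modeling: turn a 1-D series into
    (input window, next value) training pairs."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

def fit_mean_model(pairs):
    """Toy stand-in model: predicts the mean of the input window.
    A real study would fit LSTM/XGBoost/etc. on the pairs."""
    def predict(window):
        return sum(window) / len(window)
    return predict

def rolling_forecast(model, history, window, horizon):
    """Multi-step rolling prediction: each one-step forecast is
    appended to the input buffer and fed back for the next step."""
    buf = list(history[-window:])
    out = []
    for _ in range(horizon):
        yhat = model(buf)
        out.append(yhat)
        buf = buf[1:] + [yhat]
    return out

series = [float(x) for x in range(20)]   # synthetic example data
pairs = make_windows(series, window=4)
model = fit_mean_model(pairs)
preds = rolling_forecast(model, series, window=4, horizon=3)
```

Because rolling prediction feeds its own outputs back in, errors compound with the horizon, which is exactly why the long-term forecasting accuracy compared in this study is the demanding case.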
📝 Abstract
The explosion of Time Series (TS) data, driven by advancements in technology, necessitates sophisticated analytical methods. Modern management systems increasingly rely on analyzing this data, highlighting the importance of efficient processing techniques. State-of-the-art Machine Learning (ML) approaches for TS analysis and forecasting are becoming prevalent. This paper briefly describes and compiles suitable algorithms for the TS regression task. We compare these algorithms against each other and against the classic ARIMA method using diverse datasets: complete data, data with outliers, and data with missing values. The focus is on forecasting accuracy, particularly for long-term predictions. This research aids in selecting the most appropriate algorithm based on forecasting needs and data characteristics.