From On-chain to Macro: Assessing the Importance of Data Source Diversity in Cryptocurrency Market Forecasting

📅 2025-06-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study systematically investigates the impact of data source diversity on cryptocurrency price forecasting performance. To address this, we propose the Crypto100 index and develop a domain-aware feature selection and dimensionality reduction framework tailored to heterogeneous multi-source data—including on-chain metrics, technical indicators, sentiment signals, traditional financial market data, and macroeconomic variables—integrated within a rolling-window modeling framework and evaluated via LSTM and multivariate regression benchmarks. Our key empirical contribution is the first identification that on-chain features contribute over 40% to short-horizon forecasting accuracy, establishing their centrality; conversely, macroeconomic and traditional market indicators exhibit markedly increasing importance with longer forecast horizons. Experimental results demonstrate that the proposed methodology reduces mean absolute error (MAE) by an average of 18.7% across short-, medium-, and long-term forecasting tasks, significantly enhancing model generalizability and stability.

Technology Category

Application Category

📝 Abstract
This study investigates the impact of data source diversity on the performance of cryptocurrency forecasting models by integrating various data categories, including technical indicators, on-chain metrics, sentiment and interest metrics, traditional market indices, and macroeconomic indicators. We introduce the Crypto100 index, representing the top 100 cryptocurrencies by market capitalization, and propose a novel feature reduction algorithm to identify the most impactful and resilient features from diverse data sources. Our comprehensive experiments demonstrate that data source diversity significantly enhances the predictive performance of forecasting models across different time horizons. Key findings include the paramount importance of on-chain metrics for both short-term and long-term predictions, the growing relevance of traditional market indices and macroeconomic indicators for longer-term forecasts, and substantial improvements in model accuracy when diverse data sources are utilized. These insights help demystify the short-term and long-term driving factors of the cryptocurrency market and lay the groundwork for developing more accurate and resilient forecasting models.
Problem

Research questions and friction points this paper is trying to address.

Impact of diverse data sources on cryptocurrency forecasting models
Identifying key features from varied data for market predictions
Enhancing model accuracy with multi-source data integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates diverse data sources for crypto forecasting
Introduces Crypto100 index for market representation
Proposes novel feature reduction algorithm
🔎 Similar Papers
No similar papers found.
G
Giorgos Demosthenous
University of Cyprus, Nicosia, Cyprus
Chryssis Georgiou
Chryssis Georgiou
Professor of Computer Science, University of Cyprus
Distributed ComputingAlgorithmsTheory
E
Eliada Polydorou
University of Cyprus, Nicosia, Cyprus