Optimizing VarLiNGAM for Scalable and Efficient Time Series Causal Discovery

📅 2024-09-09
🏛️ arXiv.org
🤖 AI Summary
Problem: VarLiNGAM, a prominent linear non-Gaussian causal discovery method for multivariate time series, has high computational complexity, O(m³·n), and scales poorly to high-dimensional settings. Method: We propose an efficient, scalable optimization framework featuring (i) a dedicated temporal causal data generator; (ii) an algorithmic reformulation that reduces VarLiNGAM's complexity to O(m³ + m²·n); (iii) a hybrid causal inference mechanism integrating Independent Component Analysis (ICA) with lagged Granger causality testing; and (iv) CPU/GPU co-acceleration. Results: On large-scale synthetic time series (200–400 dimensions), our method achieves a 7–13× speedup over the original VarLiNGAM and ≈4.5× over its GPU-accelerated variant, while maintaining high structural recovery accuracy across simulated, real-world, and large-scale datasets. The work targets the scalability bottleneck of linear non-Gaussian methods for time series, with healthcare and finance named as fields that could benefit.
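As a rough worked comparison of the two stated bounds, the ratio below treats m and n simply as the two size parameters in O(m³·n) and O(m³ + m²·n). The numeric values are illustrative assumptions, not figures from the paper.

```latex
\[
\frac{m^{3} n}{m^{3} + m^{2} n} \;=\; \frac{m n}{m + n}
\]
% Illustrative values only (not from the paper): m = 400, n = 10^{4}
\[
\frac{400 \cdot 10^{4}}{400 + 10^{4}} \;=\; \frac{4 \times 10^{6}}{1.04 \times 10^{4}} \;\approx\; 385
\]
```

This ratio compares asymptotic operation counts only. The reported wall-clock speedups (7–13×) are smaller, presumably because constant factors, memory traffic, and parallel hardware are not captured by such counts.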

📝 Abstract
Causal discovery identifies causal relationships in data, but the task is more complex for multivariate time series due to the computational demands of methods like VarLiNGAM, which combines a Vector Autoregressive Model with a Linear Non-Gaussian Acyclic Model. This study optimizes causal discovery specifically for time series data, which are common in practical applications. Time series causal discovery is particularly challenging because of temporal dependencies and potential time lag effects. By developing a specialized dataset generator and reducing the computational complexity of the VarLiNGAM model from O(m³·n) to O(m³ + m²·n), this study enhances the feasibility of processing large datasets. The proposed methods were validated on advanced computational platforms and tested on simulated, real-world, and large-scale datasets, demonstrating improved efficiency and performance. The optimized algorithm achieved 7 to 13 times speedup compared to the original and about 4.5 times speedup compared to the GPU-accelerated version on large-scale datasets with feature sizes from 200 to 400. Our methods extend current causal discovery capabilities, making them more robust, scalable, and applicable to real-world scenarios, facilitating advancements in fields like healthcare and finance.
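To make the "Vector Autoregressive Model plus Linear Non-Gaussian Acyclic Model" combination concrete, here is a minimal sketch of a synthetic data generator and the VAR residual step that a LiNGAM estimator would then consume. It is not the paper's generator or implementation: the function names `generate_var_lingam` and `fit_var1_residuals`, the single lag, the Laplace noise, and the coefficient ranges are all assumptions chosen for illustration.

```python
import numpy as np


def generate_var_lingam(n_samples, n_features, seed=0):
    """Simulate x(t) = B0 x(t) + B1 x(t-1) + e(t) with non-Gaussian e(t).

    Hypothetical generator: one lag, Laplace noise, arbitrary coefficient ranges.
    """
    rng = np.random.default_rng(seed)
    m = n_features
    # Strictly lower-triangular B0, so instantaneous effects form a DAG.
    B0 = np.tril(rng.uniform(-0.5, 0.5, size=(m, m)), k=-1)
    # Sparse lagged effects (about 20% of entries non-zero).
    B1 = rng.uniform(-0.3, 0.3, size=(m, m)) * (rng.random((m, m)) < 0.2)
    # Reduced form: x(t) = M1 x(t-1) + A e(t), with A = (I - B0)^{-1}.
    A = np.linalg.inv(np.eye(m) - B0)
    M1 = A @ B1
    # Shrink B1 if needed so the spectral radius of M1 stays below 1.
    radius = np.max(np.abs(np.linalg.eigvals(M1)))
    if radius >= 0.9:
        B1 = B1 * (0.9 / radius)
        M1 = A @ B1
    E = rng.laplace(size=(n_samples, m))  # zero-mean, non-Gaussian noise
    X = np.zeros((n_samples, m))
    for t in range(1, n_samples):
        X[t] = M1 @ X[t - 1] + A @ E[t]
    return X, B0, B1


def fit_var1_residuals(X):
    """Least-squares VAR(1) fit (no intercept); returns M1 estimate and residuals."""
    past, present = X[:-1], X[1:]
    # Solve present ≈ past @ coef, so coef is the transpose of M1.
    coef, *_ = np.linalg.lstsq(past, present, rcond=None)
    residuals = present - past @ coef
    return coef.T, residuals


X, B0, B1 = generate_var_lingam(n_samples=5000, n_features=5, seed=0)
M1_hat, residuals = fit_var1_residuals(X)
```

In the standard VarLiNGAM recipe, a LiNGAM estimator is then applied to `residuals` to recover the instantaneous matrix B0, and the lagged effects follow as B1 = (I − B0)·M1. That step is omitted here to keep the sketch short.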
Problem

Research questions and friction points this paper is trying to address.

Scalable causal discovery in time-series data
Heuristic approximation for VarLiNGAM algorithm
Reducing computational complexity for large datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Heuristic approximation of VarLiNGAM algorithm
One-time precomputation of statistical values
Reduces time complexity to O(m²·n + m³)
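The source does not say which statistical values are precomputed, so the following is only a sketch of the general idea, using pairwise regression slopes as a stand-in statistic (an assumption for illustration). Recomputing each slope from raw samples costs a pass over n samples per pair; caching the covariance matrix once costs O(m²·n) up front, after which every slope is a constant-time lookup. The function names are hypothetical.

```python
import numpy as np


def pairwise_slopes_naive(X):
    """Slope of x_j regressed on x_i, recomputed from raw data for every pair.

    Each call costs O(m^2 * n); repeating it in an outer loop over m
    iterations would give O(m^3 * n) overall.
    """
    n, m = X.shape
    Xc = X - X.mean(axis=0)
    slopes = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            slopes[i, j] = (Xc[:, i] @ Xc[:, j]) / (Xc[:, i] @ Xc[:, i])
    return slopes


def pairwise_slopes_precomputed(cov):
    """Same slopes from a cached covariance matrix: O(m^2), no pass over samples."""
    # slopes[i, j] = cov[i, j] / var[i]
    return cov / np.diag(cov)[:, None]


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))

cov = np.cov(X, rowvar=False)  # one-time O(m^2 * n) precomputation
fast = pairwise_slopes_precomputed(cov)
slow = pairwise_slopes_naive(X)
```

Both functions return the same matrix. The difference is that `pairwise_slopes_precomputed` can be called repeatedly inside an iterative procedure without touching the n samples again.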