🤖 AI Summary
Modeling high-dimensional nonlinear vector autoregressive (VAR) processes remains challenging due to complex dependencies and the curse of dimensionality.
Method: This paper proposes a nonparametric approach based on sparse additive models: nonlinear dynamics are represented via basis function expansions, while structural learning and parameter estimation are achieved through sparse regularization combined with least-squares optimization.
Contribution/Results: Theoretically, the paper extends prior theory for linear VAR models to cover both non-Gaussianity and nonlinearity. Its key technical result is a set of sharp Bernstein-type inequalities for tail probabilities in non-sub-Gaussian linear and nonlinear VAR processes, which match the classical Bernstein inequality for independent random variables. It also derives convergence rates and model selection consistency for the least squares estimator in terms of the dependence measures of the process, error moment conditions, sparsity, and the basis expansion. Numerical experiments support the theoretical findings and illustrate the advantages of the nonlinear VAR model on a gene expression time series dataset. Application areas include econometrics, biology, and climatology.
📝 Abstract
High-dimensional vector autoregressive (VAR) models have numerous applications in fields such as econometrics, biology, and climatology. While prior research has mainly focused on linear VAR models, these approaches can be restrictive in practice. To address this, we introduce a high-dimensional non-parametric sparse additive model, providing a more flexible framework. Our method employs basis expansions to construct high-dimensional nonlinear VAR models. We derive convergence rates and model selection consistency for least squares estimators, considering dependence measures of the processes, error moment conditions, sparsity, and basis expansions. Our theory significantly extends prior results for linear VAR models by incorporating both non-Gaussianity and non-linearity. As a key contribution, we derive sharp Bernstein-type inequalities for tail probabilities in both non-sub-Gaussian linear and nonlinear VAR processes, which match the classical Bernstein inequality for independent random variables. Additionally, we present numerical experiments that support our theoretical findings and demonstrate the advantages of the nonlinear VAR model for a gene expression time series dataset.
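To make the idea of a sparse additive VAR built from basis expansions more concrete, here is a minimal sketch on simulated data. It is not the authors' implementation: the polynomial basis, the group lasso penalty, the proximal gradient solver, the penalty level `lam`, and all function names are assumptions chosen for illustration. Each lagged series is expanded into a small block of basis features, and a penalized least squares fit decides which blocks (that is, which lagged series) enter the equation for one response series.

```python
import numpy as np

def simulate_additive_var(T=400, p=5, seed=0):
    """Toy VAR(1), assumed for illustration: series 0 depends nonlinearly on lagged series 1 only."""
    rng = np.random.default_rng(seed)
    X = np.zeros((T, p))
    for t in range(1, T):
        noise = rng.standard_normal(p)
        X[t, 1:] = 0.5 * X[t - 1, 1:] + noise[1:]             # independent linear AR(1) series
        X[t, 0] = 1.5 * np.sin(X[t - 1, 1]) + 0.3 * noise[0]  # nonlinear additive link
    return X

def basis_expand(X_lag, degree=3):
    """Expand each lagged series into standardized polynomial features (one group per series)."""
    blocks = [X_lag ** d for d in range(1, degree + 1)]        # each block has shape (n, p)
    Z = np.stack(blocks, axis=2).reshape(X_lag.shape[0], -1)   # group j -> columns j*degree .. (j+1)*degree - 1
    return (Z - Z.mean(axis=0)) / Z.std(axis=0)

def group_lasso(Z, y, n_groups, lam=0.15, n_iter=500):
    """Proximal gradient for (1/2n)||y - Z b||^2 + lam * sum_g ||b_g||_2."""
    n, m = Z.shape
    size = m // n_groups
    beta = np.zeros(m)
    step = n / np.linalg.norm(Z, 2) ** 2                       # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        beta = beta - step * (Z.T @ (Z @ beta - y) / n)        # gradient step on the least squares loss
        B = beta.reshape(n_groups, size)
        norms = np.linalg.norm(B, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        beta = (shrink * B).ravel()                            # block soft-thresholding
    return beta.reshape(n_groups, size)

X = simulate_additive_var()
Z = basis_expand(X[:-1])                                       # features built from X_{t-1}
y = X[1:, 0] - X[1:, 0].mean()                                 # centered response: series 0 at time t
B_hat = group_lasso(Z, y, n_groups=X.shape[1])
group_norms = np.linalg.norm(B_hat, axis=1)
selected = np.flatnonzero(group_norms > 0)
print("group norms:", np.round(group_norms, 3))
print("selected lagged series:", selected)
```

In this toy setup the block for lagged series 1 should carry the largest coefficient norm, since it is the only true driver of series 0. Repeating the fit with each series as the response would give an estimate of the full sparse dependence structure; a spline basis could replace the polynomial one without changing the rest of the sketch.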