🤖 AI Summary
We address variable selection and prediction in high-dimensional functional data regression, which arises, for example, in high-frequency finance and neuroimaging. We propose a group elastic-net regularization method formulated within a reproducing kernel Hilbert space (RKHS) framework. Theoretically, we establish the first non-asymptotic variable selection consistency result and prove that, under a functional irrepresentable condition, the post-selection estimator achieves the oracle minimax optimal prediction rate. Technically, we integrate Gâteaux subdifferentiability analysis with the RKHS structure to rigorously characterize structural dependencies among high-dimensional functional predictors. Extensive simulations and a real-data analysis of the Human Connectome Project (HCP) demonstrate the method's effectiveness in identifying salient functional variables and improving predictive accuracy. Our approach provides a new paradigm for high-dimensional functional regression that ensures both statistical guarantees and computational feasibility.
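To fix ideas, a plausible form of the penalized criterion behind this approach can be sketched as follows; the notation here is illustrative and not taken verbatim from the paper. With responses $y_i$, functional predictors $x_{ij}$, coefficient functions $\beta_j$ in an RKHS $\mathcal{H}_j$, and a mixing weight $\alpha \in [0,1]$, a group elastic-net objective would read

$$
\min_{\beta_1,\dots,\beta_p}\; \frac{1}{2n}\sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p}\langle x_{ij},\,\beta_j\rangle_{L^2}\Big)^{2} \;+\; \lambda \sum_{j=1}^{p}\Big(\alpha\,\|\beta_j\|_{\mathcal{H}_j} + (1-\alpha)\,\|\beta_j\|_{\mathcal{H}_j}^{2}\Big),
$$

where the unsquared RKHS-norm term drives entire coefficient functions $\beta_j$ to zero (variable selection) and the squared term stabilizes the fit among correlated functional predictors.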
📝 Abstract
High-dimensional functional data have become increasingly prevalent in modern applications such as high-frequency finance and neuroimaging. We investigate a class of high-dimensional linear regression models in which each predictor is a random element of an infinite-dimensional function space and the number of functional predictors $p$ can be ultra-high. Assuming that each unknown coefficient function belongs to a reproducing kernel Hilbert space (RKHS), we regularize the model fit by imposing a group elastic-net type penalty on the RKHS norms of the coefficient functions. We show that the loss function is Gâteaux subdifferentiable and that our functional elastic-net estimator exists and is unique in the product RKHS. Under suitable sparsity assumptions and a functional version of the irrepresentable condition, we derive a non-asymptotic tail bound establishing the variable selection consistency of our method. Allowing the number of true functional predictors $q$ to diverge with the sample size, we further show that a post-selection refined estimator achieves the oracle minimax optimal prediction rate. The proposed methods are illustrated through simulation studies and a real-data application from the Human Connectome Project.
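The selection mechanism described above can be illustrated numerically. The sketch below is not the paper's algorithm: it discretizes each functional predictor on a grid, replaces the RKHS norms with Euclidean norms of the discretized coefficient functions, and minimizes a squared-error loss plus a group elastic-net penalty by proximal gradient descent. All dimensions and tuning values (`lam`, `alpha`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, T = 200, 10, 20                      # samples, functional predictors, grid points
tgrid = np.linspace(0.0, 1.0, T)

# Smooth random predictor curves: low-frequency Fourier combinations (synthetic data).
Xfun = np.zeros((n, p, T))
for k in (1, 2, 3):
    Xfun += rng.normal(size=(n, p, 1)) * np.sin(k * np.pi * tgrid)

# Sparse truth: only predictors 0 and 1 carry signal.
beta_true = np.zeros((p, T))
beta_true[0] = np.sin(np.pi * tgrid)
beta_true[1] = np.cos(np.pi * tgrid)

# Riemann-sum approximation of the L2 inner products <x_ij, beta_j>.
Z = Xfun.reshape(n, p * T) / T
y = Z @ beta_true.ravel() + 0.1 * rng.normal(size=n)

lam, alpha = 0.05, 0.9                     # hypothetical tuning values
lam1, lam2 = lam * alpha, lam * (1.0 - alpha)
step = n / np.linalg.norm(Z, 2) ** 2       # 1 / Lipschitz constant of the gradient
b = np.zeros(p * T)

for _ in range(500):
    z = b - step * (Z.T @ (Z @ b - y) / n)             # gradient step on the loss
    zg = z.reshape(p, T)
    norms = np.linalg.norm(zg, axis=1, keepdims=True)
    # Group elastic-net proximal map: soft-threshold each group's norm, then shrink.
    scale = np.maximum(0.0, 1.0 - step * lam1 / np.maximum(norms, 1e-12))
    b = (scale / (1.0 + step * lam2) * zg).ravel()

selected = [j for j in range(p) if np.linalg.norm(b.reshape(p, T)[j]) > 1e-8]
print("selected groups:", selected)
```

The unsquared part of the group penalty zeroes out whole coefficient vectors (mimicking functional variable selection), after which a refit restricted to the selected groups would play the role of the post-selection refined estimator.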