🤖 AI Summary
This study addresses a theoretical gap: why functional principal component analysis (FPCA) can fail severely on rough functional data. The authors propose a model that explicitly captures the roughness of the data and use it to give the first theoretical explanation of how roughness biases FPCA, quantifying that bias as a function of the roughness. The model exhibits a phase transition marking the point at which FPCA becomes entirely uninformative. Combining recent advances in random matrix theory and generic chaining with tools from functional data analysis, the authors discuss diagnostic tests for whether estimated principal components are informative and derive results on spectral statistics that may serve as a foundation for goodness-of-fit tests. Simulations, together with climate and environmental datasets, illustrate the effects of roughness on FPCA.
📝 Abstract
Functional data analysis (FDA) is concerned with the analysis of infinite-dimensional data functions. Functional principal component analysis (FPCA) is a key method to obtain finite-dimensional summaries. Consistency of FPCA has been theoretically established for sufficiently regular data functions. However, empirical evidence shows that FPCA can become severely inconsistent when the underlying functions are too rough. This paper provides the first theoretical explanation for this phenomenon. We propose a model that explicitly captures the roughness of functional data and allows us to quantify the resulting bias of FPCA as a function of the roughness. The model undergoes a phase transition marking the point at which FPCA becomes entirely uninformative. Based on these probabilistic results, we discuss diagnostic tests for informative principal components. As an additional contribution, we derive results on spectral statistics that may serve as a foundation for goodness-of-fit tests for rough functional data. Mathematically, our approach combines recent advances in random matrix theory and generic chaining with tools from FDA. We illustrate the effects of roughness on FPCA using simulations, as well as climate and environmental datasets.
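The loss of information described above can be illustrated with a small simulation. The sketch below is not the model proposed in the paper. It assumes a standard spiked-covariance toy setup in which each curve is a single smooth component, sin(πt), observed on a grid, and roughness is mimicked by independent Gaussian noise at every grid point. The function name `leading_pc_alignment` and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def leading_pc_alignment(noise_sd, n=200, p=400, seed=0):
    """Absolute cosine between the true component and the first estimated PC.

    Toy assumption (not the paper's model): curves are score * phi(t) plus
    independent noise at each of the p grid points, with n curves in total.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, p)

    # True smooth component, normalized to unit Euclidean norm on the grid.
    phi = np.sin(np.pi * t)
    phi /= np.linalg.norm(phi)

    # One random score per curve, plus pointwise noise standing in for roughness.
    scores = rng.standard_normal(n)
    X = np.outer(scores, phi) + noise_sd * rng.standard_normal((n, p))

    # Center the sample; the right singular vectors of the centered data
    # are the eigenvectors of the sample covariance matrix.
    X -= X.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)

    # Eigenvectors are defined up to sign, so compare via the absolute cosine.
    return float(abs(vt[0] @ phi))


if __name__ == "__main__":
    for sd in (0.1, 0.5, 1.0, 2.0):
        print(f"noise_sd={sd:>4}: alignment={leading_pc_alignment(sd):.3f}")
```

In this toy setup, an alignment close to 1 means the first estimated component recovers the true one, while an alignment near 0 means it carries essentially no information about it. Increasing `noise_sd` moves the simulation from the first regime toward the second, which is the qualitative behavior the abstract attributes to rough data.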