🤖 AI Summary
This paper addresses uncertainty quantification in regression for functional and high-dimensional complex responses (e.g., functions, random graphs) in separable Hilbert spaces. We propose a model-free, conditional depth-based method for constructing predictive regions. The approach combines kernel mean embeddings with integral probability metrics to obtain a computationally tractable conditional depth framework. We further introduce a conformalized variant that yields prediction and tolerance regions with finite-sample, non-asymptotic marginal coverage guarantees. Theoretically, we establish both conditional and unconditional consistency, as well as fast convergence rates in certain homoscedastic settings. Simulations across diverse functional and Euclidean data settings assess coverage accuracy and region compactness. Applied to a digital health study of physical activity, the method provides individualized uncertainty characterization to support personalized recommendations.
📝 Abstract
Depth measures are powerful tools for defining level sets in emerging, non-standard, and complex random objects such as high-dimensional multivariate data, functional data, and random graphs. Despite their favorable theoretical properties, the integration of depth measures into regression modeling to provide prediction regions remains largely underexplored. To address this gap, we propose a novel, model-free uncertainty quantification algorithm based on conditional depth measures, specifically conditional kernel mean embeddings and an integrated depth measure. These new algorithms can be used to define prediction and tolerance regions when predictors and responses are defined in separable Hilbert spaces. The use of kernel mean embeddings ensures faster convergence rates in prediction region estimation. To enhance the practical utility of the algorithms with finite samples, we also introduce a conformal prediction variant that provides marginal, non-asymptotic guarantees for the derived prediction regions. Additionally, we establish both conditional and unconditional consistency results, as well as fast convergence rates in certain homoscedastic settings. We evaluate the finite-sample performance of our method in extensive simulation studies involving various types of functional data and traditional Euclidean scenarios. Finally, we demonstrate the practical relevance of our approach through a digital health application related to physical activity, aiming to provide personalized recommendations.
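To make the conformal step concrete, the following is a minimal sketch of a split-conformal prediction region built from a kernel-based conditional depth score. It is not the algorithm from the paper. It assumes a simple Nadaraya-Watson weighting as a stand-in for the conditional kernel mean embedding, Gaussian kernels with hand-picked bandwidths, and curves discretized on a common grid. All function names, the simulated data, and the bandwidth values are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def depth_scores(X_train, Y_train, X_query, Y_query, bw_x=0.2, bw_y=1.0):
    """Depth of each Y_query[j] under the estimated conditional law at X_query[j].

    The score is the inner product of k(y, .) with a Nadaraya-Watson
    estimate of the conditional mean embedding (an illustrative stand-in).
    Larger values mean the response is more central.
    """
    W = gaussian_gram(X_query, X_train, bw_x)
    W = W / W.sum(axis=1, keepdims=True)       # weights sum to one per query point
    K = gaussian_gram(Y_query, Y_train, bw_y)
    return (W * K).sum(axis=1)

def conformal_threshold(cal_scores, alpha):
    """k-th smallest calibration depth, with k = floor(alpha * (n + 1))."""
    n = len(cal_scores)
    k = int(np.floor(alpha * (n + 1)))
    if k < 1:
        return -np.inf                          # too few points: region is everything
    return np.sort(cal_scores)[k - 1]

# Simulated functional responses on a grid (hypothetical data-generating process).
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 20)

def simulate(n):
    X = rng.uniform(0.0, 1.0, size=(n, 1))
    Y = X * np.sin(2 * np.pi * grid)[None, :] + 0.2 * rng.standard_normal((n, grid.size))
    return X, Y

X_tr, Y_tr = simulate(300)      # used only to estimate the depth
X_cal, Y_cal = simulate(500)    # used only to calibrate the threshold
X_te, Y_te = simulate(1000)     # used only to check coverage

alpha = 0.1
threshold = conformal_threshold(depth_scores(X_tr, Y_tr, X_cal, Y_cal), alpha)

# The prediction region at x is {y : depth(x, y) >= threshold}.
covered = depth_scores(X_tr, Y_tr, X_te, Y_te) >= threshold
coverage = covered.mean()
print(f"empirical marginal coverage: {coverage:.3f}")
```

Because the threshold is an order statistic of calibration scores that are exchangeable with the test score, the region covers a new response with probability at least 1 - alpha marginally, whatever the quality of the depth estimate. A poor depth estimate affects only the size and shape of the region, not the validity of the marginal guarantee.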