🤖 AI Summary
This work proposes a deep bootstrap method based on conditional diffusion models to address the limitations of traditional bootstrap approaches in nonparametric regression, which decouple conditional distribution estimation from the regression task and struggle with high-dimensional or multimodal responses. By integrating conditional diffusion mechanisms into the bootstrap framework for the first time, the method jointly learns the conditional distribution and generates synthetic responses, reframing nonparametric regression as conditional sample mean estimation within an end-to-end unified model. The approach achieves both high accuracy and scalability in complex regression settings and establishes theoretical guarantees for the bootstrap procedure, including optimal convergence rates under the Wasserstein distance.
📝 Abstract
In this work, we propose a novel deep bootstrap framework for nonparametric regression based on conditional diffusion models. Specifically, we construct a conditional diffusion model to learn the distribution of the response variable given the covariates. This model is then used to generate bootstrap samples by pairing the original covariates with newly synthesized responses. We reformulate nonparametric regression as conditional sample mean estimation, which is implemented directly via the learned conditional diffusion model. Unlike traditional bootstrap methods that decouple the estimation of the conditional distribution, sampling, and nonparametric regression, our approach integrates these components into a unified generative framework. With the expressive capacity of diffusion models, our method facilitates both efficient sampling from high-dimensional or multimodal distributions and accurate nonparametric estimation. We establish rigorous theoretical guarantees for the proposed method. In particular, we derive optimal end-to-end convergence rates in the Wasserstein distance between the learned and target conditional distributions. Building on this foundation, we further establish the convergence guarantees of the resulting bootstrap procedure. Numerical studies demonstrate the effectiveness and scalability of our approach for complex regression tasks.