🤖 AI Summary
This paper addresses the lack of distribution-free theoretical guarantees for the *m*-out-of-*n* bootstrap in quantile estimation, particularly for studentized medians. Under a mild moment condition, it establishes a central limit theorem for a fully data-driven estimator that involves no unknown nuisance parameters and assumes no specific underlying distribution; under slightly stronger assumptions, it derives an Edgeworth expansion, with a Berry–Esseen bound as a corollary. Methodologically, the analysis combines sampling without replacement, studentization, Berry–Esseen bounds, and stochastic process techniques to obtain precise convergence rates and explicit error bounds. Key contributions include: (1) the first distribution-free theoretical framework for bootstrap-based quantile inference; (2) a counter-example showing that the required moment condition is essentially tight; (3) extensions to modern learning settings, including MCMC (e.g., quantile inference for random-walk Metropolis–Hastings) and ergodic Markov decision processes (e.g., reward distribution estimation); and (4) theoretical support for the method's use in robust inference with heavy-tailed data.
📝 Abstract
The m-out-of-n bootstrap, originally proposed by Bickel, Götze, and van Zwet (1992), approximates the distribution of a statistic by repeatedly drawing subsamples of size m (with m much smaller than n) without replacement from an original sample of size n. It is now routinely used for robust inference with heavy-tailed data, bandwidth selection, and other large-sample applications. Despite its broad applicability across econometrics, biostatistics, and machine learning, rigorous parameter-free guarantees for the soundness of the m-out-of-n bootstrap when estimating sample quantiles have remained elusive. This paper establishes such guarantees by analyzing the estimator of sample quantiles obtained from m-out-of-n resampling of a dataset of size n. We first prove a central limit theorem for a fully data-driven version of the estimator that holds under a mild moment condition and involves no unknown nuisance parameters. We then show that the moment assumption is essentially tight by constructing a counter-example in which the CLT fails. Strengthening the assumptions slightly, we derive an Edgeworth expansion that provides exact convergence rates and, as a corollary, a Berry-Esseen bound on the bootstrap approximation error. Finally, we illustrate the scope of our results by deriving parameter-free asymptotic distributions for practical statistics, including the quantiles for random-walk Metropolis-Hastings and the rewards of ergodic Markov decision processes, thereby demonstrating the usefulness of our theory in modern estimation and learning tasks.
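The resampling scheme described in the abstract can be illustrated with a minimal sketch. This is not the paper's studentized, data-driven estimator: it is a plain, unstudentized m-out-of-n interval for a sample quantile, and the function name `m_out_of_n_quantile_ci`, the parameters `n_boot`, `alpha`, and `seed`, and the choice m = 200 are all assumptions made for illustration. It also assumes m is small relative to n, so no finite-population correction is applied.

```python
import numpy as np


def m_out_of_n_quantile_ci(x, m, p=0.5, n_boot=2000, alpha=0.05, seed=0):
    """Illustrative (unstudentized) m-out-of-n interval for the p-th quantile.

    Hypothetical helper, not the estimator analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")

    # Quantile of the full sample, used to center the resampled roots.
    theta_n = np.quantile(x, p)

    roots = np.empty(n_boot)
    for b in range(n_boot):
        # Draw a subsample of size m WITHOUT replacement.
        sub = rng.choice(x, size=m, replace=False)
        # Root at the subsample scale: sqrt(m) * (theta*_m - theta_n).
        roots[b] = np.sqrt(m) * (np.quantile(sub, p) - theta_n)

    # Empirical quantiles of the roots approximate those of
    # sqrt(n) * (theta_n - theta); rescale by sqrt(n) to get the interval.
    lo_q, hi_q = np.quantile(roots, [alpha / 2, 1 - alpha / 2])
    ci = (theta_n - hi_q / np.sqrt(n), theta_n - lo_q / np.sqrt(n))
    return theta_n, ci


# Usage on heavy-tailed data: standard Cauchy, whose true median is 0.
data = np.random.default_rng(1).standard_cauchy(5000)
estimate, (lower, upper) = m_out_of_n_quantile_ci(data, m=200)
print(estimate, lower, upper)
```

The key design point is the rescaling: the roots are computed at the subsample scale sqrt(m) and then mapped back to the full-sample scale sqrt(n), which is what lets a small subsample size stand in for the full sample.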