Wasserstein-p Central Limit Theorem Rates: From Local Dependence to Markov Chains

📅 2026-01-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates finite-sample convergence rates for the central limit theorem (CLT) in Wasserstein-p distance for multivariate dependent data, focusing on two canonical dependence structures: locally dependent sequences and geometrically ergodic Markov chains. By establishing a Wasserstein-1 Gaussian approximation error bound tailored to dependent data and proving that the regeneration times of geometrically ergodic Markov chains exhibit geometric tails—without requiring strong aperiodicity assumptions—the work achieves, for the first time, the optimal $O(n^{-1/2})$ convergence rate in $W_1$ distance. Under mild moment conditions, it further extends this result to obtain $W_p$-CLT rates for $p \geq 2$. As an application, the study derives the first optimal $W_1$-CLT rate for multivariate U-statistics under dependence, substantially improving upon existing theoretical upper bounds for Wasserstein CLT rates in dependent settings.
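For reference, the distance in question can be written out as follows. The definition of $W_p$ is the standard one; the schematic form of the rate uses notation assumed here rather than taken from the paper ($S_n$ for the centered partial sum, $\Sigma$ for the limiting covariance, $C$ for a constant independent of $n$), and the precise conditions and constants are those stated in the paper itself.

$$
W_p(\mu,\nu) \;=\; \Big( \inf_{\gamma \in \Gamma(\mu,\nu)} \int \|x-y\|^p \, d\gamma(x,y) \Big)^{1/p},
\qquad
W_1\Big( \mathcal{L}\big(S_n/\sqrt{n}\big),\; \mathcal{N}(0,\Sigma) \Big) \;\le\; C\, n^{-1/2},
$$

where $\Gamma(\mu,\nu)$ denotes the set of couplings of $\mu$ and $\nu$.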

📝 Abstract

Finite-time central limit theorem (CLT) rates play a central role in modern machine learning. In this paper, we study CLT rates for multivariate dependent data in Wasserstein-$p$ ($W_p$) distance, for general $p \geq 1$. We focus on two fundamental dependence structures that commonly arise in machine learning: locally dependent sequences and geometrically ergodic Markov chains. In both settings, we establish the first optimal $O(n^{-1/2})$ rate in $W_1$, as well as the first $W_p$ ($p \geq 2$) CLT rates under mild moment assumptions, substantially improving on the best previously known bounds in these dependent-data regimes. As an application of our optimal $W_1$ rate for locally dependent sequences, we further obtain the first optimal $W_1$-CLT rate for multivariate $U$-statistics. On the technical side, we derive a tractable auxiliary bound for $W_1$ Gaussian approximation errors that is well suited to studying dependent data. For Markov chains, we further prove that the regeneration time of the split chain associated with a geometrically ergodic chain has a geometric tail, without assuming strong aperiodicity or other restrictive conditions. These tools may be of independent interest; they enable our optimal $W_1$ rates and underpin our $W_p$ ($p \geq 2$) results.
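The $O(n^{-1/2})$ rate in $W_1$ can be checked by eye on a toy example. The sketch below is not the paper's method and covers only a one-dimensional, 1-dependent sequence chosen for illustration: $X_i = e_i + e_{i+1}$ with i.i.d. centered exponential $e_i$. All function names are hypothetical. It estimates $W_1$ between the law of the normalized sum and $\mathcal{N}(0,1)$ by Monte Carlo, so the estimates carry sampling error of order `reps ** -0.5` and flatten out once the true distance drops below that level.

```python
import math
import random
from statistics import NormalDist


def w1_to_standard_normal(samples):
    """Estimate W_1 between the empirical law of `samples` and N(0, 1).

    In one dimension W_1 equals the L1 distance between quantile functions,
    so we compare sorted samples with Gaussian quantiles at midpoints.
    """
    m = len(samples)
    std_normal = NormalDist()
    xs = sorted(samples)
    return sum(
        abs(x - std_normal.inv_cdf((k + 0.5) / m)) for k, x in enumerate(xs)
    ) / m


def normalized_sum_one_dependent(n, rng):
    """Return S_n / (sigma * sqrt(n)) for X_i = e_i + e_{i+1}.

    The e_i are i.i.d. centered exponentials, so (X_i) is 1-dependent with
    long-run variance sigma^2 = Var(X_1) + 2 Cov(X_1, X_2) = 2 + 2 = 4.
    """
    e = [rng.expovariate(1.0) - 1.0 for _ in range(n + 1)]
    s = sum(e[i] + e[i + 1] for i in range(n))
    return s / (2.0 * math.sqrt(n))


def estimate_w1(n, reps=5000, seed=0):
    """Monte Carlo estimate of W_1(law of normalized sum, N(0, 1))."""
    rng = random.Random(seed)
    draws = [normalized_sum_one_dependent(n, rng) for _ in range(reps)]
    return w1_to_standard_normal(draws)


if __name__ == "__main__":
    for n in (4, 16, 64):
        w1 = estimate_w1(n)
        print(f"n={n:3d}  W1 estimate={w1:.4f}  sqrt(n)*W1={math.sqrt(n) * w1:.3f}")
```

If the $n^{-1/2}$ scaling holds, the rescaled column `sqrt(n)*W1` should stay roughly of the same order as `n` grows, up to Monte Carlo noise. The quantile-matching estimator is used because it needs only the standard library; it does not extend to the multivariate setting the paper treats.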

Problem

Research questions and friction points this paper is trying to address.

Wasserstein distance

Central Limit Theorem

Dependent data

Local dependence

Markov chains

Innovation

Methods, ideas, or system contributions that make the work stand out.

Wasserstein distance

Central Limit Theorem

Dependent data

Markov chains

Local dependence
