🤖 AI Summary
This paper addresses the asymptotic distribution of Chatterjee’s rank correlation coefficient for general, possibly dependent, pairs of random variables. Motivated by the absence of a rigorous asymptotic theory beyond the independence case, we show that the coefficient is asymptotically normal whenever one variable is not a measurable function of the other, and that its asymptotic variance is uniformly bounded by 36. We also construct a consistent estimator of that variance. The proofs rely on the Hájek representation and Chatterjee’s nearest-neighbor central limit theorem. Analogous results hold for the Azadkia–Chatterjee graph-based correlation coefficient, a multivariate analogue of the original proposal. Together, these results supply a large-sample basis for inference with Chatterjee’s rank correlation under dependence.
📝 Abstract
Establishing the limiting distribution of Chatterjee's rank correlation for a general, possibly non-independent, pair of random variables has been eagerly awaited by many. This paper shows that (a) Chatterjee's rank correlation is asymptotically normal as long as one variable is not a measurable function of the other, (b) the corresponding asymptotic variance is uniformly bounded by 36, and (c) a consistent variance estimator exists. Similar results also hold for Azadkia-Chatterjee's graph-based correlation coefficient, a multivariate analogue of Chatterjee's original proposal. The proof appeals to the Hájek representation and Chatterjee's nearest-neighbor CLT.
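For readers unfamiliar with the statistic, the following is a minimal sketch of how Chatterjee's rank correlation is computed, assuming no ties in either variable. It uses the no-ties formula from Chatterjee's original proposal, which the abstract does not restate: sort the pairs by x, let r_i be the rank of the i-th y in that order, and set xi_n = 1 - 3 * sum |r_{i+1} - r_i| / (n^2 - 1). The function name `chatterjee_xi` is hypothetical, and the sketch does not implement the variance estimator discussed in the paper.

```python
import random


def chatterjee_xi(x, y):
    """Chatterjee's rank correlation, assuming no ties in x or y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    # Rearrange the y values so that the corresponding x values are increasing.
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]

    # ranks[j] is the rank (1..n) of y_sorted[j] among all y values.
    by_y = sorted(range(n), key=lambda j: y_sorted[j])
    ranks = [0] * n
    for rank, j in enumerate(by_y, start=1):
        ranks[j] = rank

    # Sum of absolute differences between consecutive ranks.
    total = sum(abs(ranks[j + 1] - ranks[j]) for j in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


def simulate_independent(n, seed=0):
    """Return xi_n for n independent uniform pairs (illustrative helper)."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    y = [rng.random() for _ in range(n)]
    return chatterjee_xi(x, y)


if __name__ == "__main__":
    # Monotone relationship: consecutive ranks differ by 1, so
    # xi_n = 1 - 3 / (n + 1), which tends to 1 as n grows.
    xs = list(range(5))
    print(chatterjee_xi(xs, [v * v for v in xs]))  # 0.5 for n = 5

    # Independent pairs: xi_n should be close to 0 for large n.
    print(simulate_independent(2000))
```

In the monotone case the statistic equals 1 - 3/(n + 1), which is why a small sample gives a value well below 1 even under perfect functional dependence. The paper's results concern the regime between these two extremes, where y is dependent on x but not a measurable function of it.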